Object representing and processing method and apparatus

ABSTRACT

This invention provides methods of multilevel mark and multilevel mark code. The methods can encode various objects, as well as the structures of and interrelations among the objects. They can distinguish different object codes, make the encoding resource inexhaustible, and solve the problem of sharing various resources. The invention also provides related processing methods, for example object input, output, and searching, and related apparatus. The methods and apparatus can be used broadly across regions, fields, and kinds of software and hardware.

TECHNOLOGY FIELD

This invention relates to methods and apparatus for object representation and processing. Objects are represented by codes or marks, and the methods and apparatus for object processing are based on that object representation.

BACKGROUND OF THE INVENTION

All codes in use today, such as ASCII, GB2312, GBK, Big5, Unicode, and ISO/IEC 10646, are non-multilevel mark codes. The multilevel machine codes I proposed in CN 1122476A (ZL 94114104.7) and CN 1182234A (ZL 96115997.9) are also non-multilevel mark codes. Singular machine codes and multilevel machine codes alike are non-multilevel mark codes.

DETAILED DESCRIPTION OF THE PRESENT INVENTION

First, this invention presents an object coding method called the multilevel mark code method; second, it presents a coding method for objects with multiple properties, called the multiple property code method; third, it presents methods of object input, output, and searching, especially for coded input objects. In the study of object coding, the most important problem, the kernel problem, is how to distinguish different object codes and different classes of codes in a code sequence. The problem can be solved by marking the mark bits of object codes; in practice, a mark is also a kind of code. So, fourth, this invention presents a method of object processing by multilevel mark. In the procedure of marking an object sequence, various matrices for tree structures are naturally formed; this leads to the fifth aspect: the matrix processing method for tree structures. The apparatus related to the methods above is the sixth aspect of this invention.

The Definition of Object

An object is something perceptible by one or more of the senses, especially by vision or touch. Images and text are objects and are perceptible; voice is perceptible; the vibration of a mobile phone is perceptible by touch. An object can have a structure. One object may consist of many child objects. An object can be static or dynamic. An object can possess properties, behaviors, or actions. The world consists of a variety of objects; each object may include data and program; the program describes a series of predefined, executable actions or methods, and the data describes the properties or structure of the object.

For example, a character, sentence, article, field, database, math symbol, formula, expression, function, figure, image, photo, picture, music symbol, song, scene, or movie is an object carrying information. A law of physics, chemical symbol, enterprise mark, program flow, experiment demonstration, data structure, or mode is also an object.

Knowledge is also an object, and so is a rule.

Object can be one dimensional or multi-dimensional.

An object can be a whole or part of a whole; for example, a cartoon and an actor in the cartoon, or an actor and the parts of the actor.

An object can be represented by a code or mark, and a code or mark is itself an object. Marks and codes can be changeable.

Knots in a rope can be used to remember something or to count; this is an ancient coding or marking method.

In the digital age, marks and codes can be realized digitally. The simplest mark is a binary 0 or 1; the simplest code is also a binary 0 or 1.

Multilevel mark and multilevel mark code are a totally new marking and coding method.

1. Multilevel Mark Code (MMC)

The MMC method is based on the multilevel mark described in section 4; to understand this section clearly, please read section 4 first.

Most popular codes consist of binary bits.

There are two main difficulties in object coding. The first is how to distinguish different kinds of codes. Taking text as an example, some character codes are one byte long, such as English, and some are two bytes long, such as Chinese. How can different kinds of codes of the same length be distinguished, and how can codes of different kinds and different lengths be distinguished? The second is how to solve the coding space problem; for example, 2-byte coding offers a space of only 64 K codes.

MMC is realized by multilevel mark. Because the code length of MMC is not limited, the coding space is unlimited, and any object can be coded by MMC. Because multilevel mark is used in MMC, it is easy to distinguish different codes, and different classes of codes, in a code sequence.

An object is encoded by multilevel mark code, and the encoding comprises the steps of:

(1) forming the object code from code segments, each code segment consisting of binary bits;

(2) selecting a mark bit in each segment;

(3) marking the mark bits of the code segments by multilevel mark.

A code segment can be a byte; for example, the code of an English letter is one byte, and the code of a Chinese character is two or more bytes.

Multiple segments refer to two or more segments.

The corresponding bit in each segment is selected as the mark bit; “corresponding bit” means that the position of the mark bit is the same in every segment, for example the first bit of each byte.

Here, the A mark or B mark in multilevel mark is binary 0, or 1.

A code encoded by multilevel mark (MM) is called a multilevel mark code; a code not encoded by multilevel mark is called a non-multilevel mark code. Therefore multilevel machine codes and single machine codes are all non-multilevel mark codes. There are different kinds of MMC, such as linked MMC, embedded MMC, combined MMC, and inter mark MMC.

A simple object can be represented by one MMC; a complicated object can be represented by a set of MMC, and the set can have a definite structure.

MMC encoding makes it possible to process text, data, pointers, and various other objects in one code sequence.

The most popular coding segment is byte.

MMC can be used to execute operations on the related objects; for example, characters represented by MMC support character operations, a matrix represented by MMC supports matrix operations, and so on.

In MMC coding, objects with high frequency can be coded with short MMC. Data or pointers can be represented by MMC, called MMC data and MMC pointers respectively; the length of the MMC can be decided by the data range of the object. Various data structures can be represented by MMC, for example lists, tree structures, and matrices. The data in a data structure can be represented by MMC. MMC can also be used to represent the relations among objects.

1.1 MMC for Different Class Objects

To distinguish objects of different classes conveniently, a class segment is set in the MMC, dividing the MMC into two parts: a class part and a coding part. The information in the class part tells whether the next segment (or segments) is a class segment or a coding segment (or segments). Assume the number of segments of an MMC is N. If N>2, the information in the class segment gives either the class information of the code or the coding class of the next segment (or segments); if N=2, the information in the class segment gives the coding class of the next segment. The class part can be one or multiple segments, or part of a segment.

When an MMC code sequence is scanned in one direction and the code changes from one class to another, an MMC with a class segment for the new object class is placed at the change; afterwards, as long as the class does not change, the class segment can be omitted. The class segment can also be omitted for the default class of MMC of a definite length.

FIG. 1 illustrates a 2-byte MMC with a class segment. The MMC uses Right 0 MM; the first byte is the class segment; the mark bit is the first bit of each byte; the mark bit of the class byte is 1, and the mark bit of the second byte is 0. The class byte gives the coding class of the second byte, for example the code of an English letter.

FIG. 2 illustrates a 3-byte MMC with a class segment. The MMC uses Right 0 MM, and the values of the mark bits are 1, 1, 0 respectively. The first byte is a class byte; it tells either that the second byte is a class byte, or the class of the code consisting of the second and third bytes. If the second byte is a class byte, it tells the code class of the third byte. If the class part of an MMC cannot be represented by one segment, it can use multiple segments; if a kind of MMC has few classes, the class part can use part of a segment.

To save space, the class part of an MMC can be omitted if the class of the MMC can be recognized from the code sequence. If the class part is omitted for scanning from left to right, the class of an MMC might not be recognizable when scanning from right to left. Therefore, for scanning in both directions, the class segment is kept in the last MMC before a class change and in the first MMC after the class change.
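The Right 0 marking described for FIG. 1 and FIG. 2 can be sketched in Python as follows. This is only a minimal illustration, assuming byte segments whose first (most significant) bit is the mark bit and whose remaining 7 bits carry the class or code value; the names `mark_right0` and `split_codes` are hypothetical and do not come from the invention's text.

```python
MARK_BIT = 0x80  # assumption: the first bit of each byte is the mark bit


def mark_right0(values):
    """Mark a group of 7-bit values with mark bits 1, ..., 1, 0."""
    if not values:
        raise ValueError("a code needs at least one segment")
    marked = bytearray((v & 0x7F) | MARK_BIT for v in values)
    marked[-1] &= 0x7F  # the rightmost segment carries mark bit 0
    return bytes(marked)


def split_codes(sequence):
    """Split a byte sequence into codes; a mark bit of 0 closes a code."""
    codes, current = [], bytearray()
    for byte in sequence:
        current.append(byte)
        if (byte & MARK_BIT) == 0:
            codes.append(bytes(current))
            current = bytearray()
    if current:
        raise ValueError("sequence ends inside an unfinished code")
    return codes


# A 1-byte code, a 2-byte code (class byte + code byte, as in FIG. 1),
# and a 3-byte code (as in FIG. 2) placed in one sequence.
stream = (mark_right0([0x41])
          + mark_right0([0x01, 0x41])
          + mark_right0([0x02, 0x10, 0x20]))
codes = split_codes(stream)
```

Under these assumptions the scanner needs no knowledge of code classes to find code boundaries; it only reads the mark bits.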

1.2 The Transformation Between Non-MMC and MMC

A method of transformation between non-MMC and MMC makes MMC compatible with non-MMC.

The transformation of non-MMC into MMC is called the multilevel direction transformation; the transformation of MMC into non-MMC is called the single direction transformation.

The transformation depends on the relation between non-MMC and MMC, which is as follows. Assume the non-MMC of one kind of object consists of N bytes, N>=1. If M of these bytes have no mark bit, then the codes of the objects can be represented by $2^M$ classes of MMC with 1 class byte added; the class byte takes $2^M$ different values, each related to one of the $2^M$ classes of MMC. If there are J kinds of non-MMC, represented by $K_1, K_2, \dots, K_J$ classes respectively, then the total number of classes is

$$S = K_1 + K_2 + \dots + K_J$$

If 1 byte does not have enough space to represent all of them, more class bytes are added to the class part of the MMC.
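The class count above can be checked with a short calculation. The sketch below assumes that each class byte offers 7 value bits (one bit being the mark bit), which matches the 128 classes per class byte mentioned later in this section; the function names are hypothetical.

```python
def total_classes(unmarked_bytes_per_kind):
    """Sum 2**M over all kinds of non-MMC, M = bytes without a mark bit."""
    return sum(2 ** m for m in unmarked_bytes_per_kind)


def class_bytes_needed(total, value_bits=7):
    """Smallest number of class bytes able to tell `total` classes apart."""
    n = 1
    while (1 << (value_bits * n)) < total:
        n += 1
    return n


# Three kinds of non-MMC: one with every byte marked (M=0) and two with
# one unmarked byte each (M=1), giving 1 + 2 + 2 = 5 classes.
s = total_classes([0, 1, 1])
n_class_bytes = class_bytes_needed(s)
```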

The said multilevel direction transformation comprises the steps of:

Recognizing the class of the non-MMC;

Selecting the class part of the MMC according to the class of the non-MMC and the related mark bit values of the multilevel mark code;

Taking the non-MMC as the coding part of the MMC, and setting the mark bits according to the coding requirement of MMC;

Combining the class part and the coding part into the MMC of the non-MMC. The class part can be omitted if the class of the MMC can be known from the context; if the length of the MMC is equal to the length of a kind of MMC without a class part, another class part is added to distinguish them.

The said single direction transformation comprises the step of:

According to the relation between the class byte and the related non-MMC, taking the coding part of the MMC as the non-MMC, and restoring the mark bit values. If the MMC carries a class byte, the class byte is removed.
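The two transformations can be sketched as follows. This is an illustration under stated assumptions, not the definitive procedure: segments are bytes, the first bit of each byte is the mark bit, Right 0 MM is used, and the class of a code is assumed to determine the original first-bit value of every byte. The function names and the class value `0x05` are hypothetical.

```python
from typing import Optional, Sequence

MARK_BIT = 0x80  # assumption: the first bit of each byte is the mark bit


def to_mmc(non_mmc: bytes, class_value: Optional[int] = None) -> bytes:
    """Multilevel direction transformation (non-MMC -> MMC), Right 0 MM."""
    if not non_mmc:
        raise ValueError("empty code")
    segments = bytearray(non_mmc)
    if class_value is not None:
        segments.insert(0, class_value & 0x7F)  # class part goes in front
    for i in range(len(segments) - 1):
        segments[i] |= MARK_BIT  # every segment but the last: mark bit 1
    segments[-1] &= 0x7F         # last segment: mark bit 0
    return bytes(segments)


def to_non_mmc(mmc: bytes, first_bits: Sequence[int],
               has_class: bool = False) -> bytes:
    """Single direction transformation (MMC -> non-MMC).

    first_bits lists the original first-bit value of each byte, which the
    class of the code is assumed to determine.
    """
    coding = mmc[1:] if has_class else mmc
    if len(coding) != len(first_bits):
        raise ValueError("class does not match the code length")
    return bytes((b & 0x7F) | (bit << 7) for b, bit in zip(coding, first_bits))


# A 2-byte code with first bits 1, 1 (the GB2312 pattern), class omitted.
plain = to_mmc(b"\xd6\xd0")
# A 2-byte code with first bits 1, 0, prefixed with a hypothetical class.
with_class = to_mmc(b"\x81\x40", class_value=0x05)
```

Restoring the original bytes only needs the first-bit pattern of the class, for example `to_non_mmc(plain, (1, 1))`.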

In the coding of non-MMC, some bit of a byte is usually taken as a mark bit. For example, ASCII is a 1-byte code whose first bit, 0, is the mark bit; GB2312 for Chinese characters is a 2-byte code in which the first bit of each byte is a mark bit, with values 1, 1. For the GBK Chinese character code, the first bit of the first byte is a mark bit with value 1, and the second byte has no mark bit, which means that any bit in that byte can be 0 or 1. The mark bits of a non-MMC influence the class part of its related MMC.

First, take 1-byte non-MMC as an example. ASCII is a 1-byte code whose first bit, 0, is the mark bit, so the MMC of ASCII can be represented by 1 class of 1-byte MMC. If a 1-byte non-MMC has no mark bit, i.e. the first bit can be 0 or 1, then it can be represented by 2 classes of 1-byte MMC. In this case, 1 class byte is added before it; with Right 0 MM adopted, the first bit of the class byte is 1 and the first bit of the second byte is 0, as illustrated in FIG. 1. For a 2-byte MMC, the class byte can represent 128 classes of codes, and the second byte is the coding part. To save space, only the coding byte need be stored for MMC of the same class. If the coding byte was changed in the multilevel direction transformation, the non-MMC can still be recovered.

Second, analyze 2-byte non-MMC. Taking Chinese as an example, GBK can be divided into two classes: in one class the first bit of both bytes is 1 (GB2312 belongs to this class); in the other the first bit of the first byte is 1 and the first bit of the second byte is 0. Therefore, GBK can be represented by two kinds of MMC, each related to one of the classes above. Similarly, Big5 can also be represented by two classes of MMC. To distinguish the different classes of MMC, 1 class byte can be added before the 2-byte code. The class byte tells either that the second byte is a class byte, or the class of the 2-byte code; if the second byte is a class byte, it tells the class of the third byte's code. Refer to the 3-byte MMC in FIG. 2.

Similarly, to distinguish the classes of 3-byte non-MMC codes, 4-byte MMC can be used, and so on. Therefore, when characters of different languages are mixed in a code string, an MMC is usually 1 byte longer than the original code, because 1 class byte is added. Obviously, the coding resource of MMC is unlimited.

When a code string is scanned from left to right and the code class changes, the first code after the change should be prefixed with a class byte; thereafter, as long as the code class does not change within a code section, the class byte of the codes in the section can be omitted.

To make bidirectional scanning possible, the first MMC after a code class change and the last MMC before the change should both be prefixed with the code class.

In a code string, if the codes of a definite code length belong to a default class, the default codes can go without a code class.

Codes of different kinds in a code string can be distinguished by outer multilevel mark.
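The placement rule for class prefixes can be sketched as below. The sketch assumes that the two ends of the string count as class changes and ignores default classes; the function name and the class labels are hypothetical.

```python
def needs_class_prefix(classes, bidirectional=False):
    """Return one flag per code: True if that code must carry its class."""
    flags = []
    for i, cls in enumerate(classes):
        first_after_change = i == 0 or classes[i - 1] != cls
        last_before_change = i == len(classes) - 1 or classes[i + 1] != cls
        flags.append(first_after_change
                     or (bidirectional and last_before_change))
    return flags


# Class of each code in a mixed string, left to right.
classes = ["en", "en", "en", "zh", "zh", "zh", "en"]
forward_only = needs_class_prefix(classes)
both_ways = needs_class_prefix(classes, bidirectional=True)
```

For left-to-right scanning only the first code of each run is prefixed; for bidirectional scanning the last code of each run is prefixed as well, so a right-to-left scan meets a class prefix before it meets the unprefixed codes of the run.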

The coding idea of Unicode is contrary to Huffman coding, because the code of a rarely used Chinese character has the same length as that of a frequently used character, and the same length as that of an English letter.

Unicode distinguishes different classes of codes by their ranges in the code space, not by a marking method.

The length of a Unicode (UTF-16) code unit is 2 bytes. The length of MMC for 1-byte character codes is a little more than 1 byte, because the class byte can be omitted most of the time. The length of MMC for Chinese character codes is a little more than 2 bytes, because MMC with a class byte are very rare: according to the frequency of GB2312 characters within GBK, the average length of MMC for Chinese character codes is about 2.0002 bytes. So, compared with Unicode for the related language codes, the average length of MMC is far less than 1.5 bytes.

If an MMC differs from its related original code, then, to distinguish different classes of codes, the original codes should be transformed into MMC in main memory, while on media outside main memory the codes can be stored as original codes. However, taking Chinese characters as an example, to keep phrase segmentation information, the MMC should be stored not only in main memory but also on other storage media; alternatively, the original codes and the mark bits of the MMC can be stored separately.

MMC is consistent with GB18030, and the two are easy to transform into each other.

Phrases of different languages can be represented by MMC.

Next, some examples of MMC for other objects are given.

Math symbols can be represented by MMC and edited just like characters; for a math expression, the result is easily obtained by scanning the expression and performing the related operations.

One math symbol relates to one image in the font library and to one MMC. Symbols can be input just like characters, by the input code of the symbol; the input code can be related to the symbol's name or part of its name.

Math symbols are processed just like characters, for example to change size and color, or to edit or modify them.

Similarly, music symbols, image icons, video frames, and data structures can be represented and processed by MMC. The MMC coding method can be applied to various objects. Because most commonly used languages are letter based, such as English, with a one-byte code whose first bit, the mark bit, is 0, MMC can be coded by multilevel mark with group representative 0; and because the mark bit of the first byte of Chinese character codes, such as GBK and Big5, is 1, MMC coding is easily kept consistent with the original codes by Right 0 multilevel mark.

An object group can be represented and encoded by MMC. This is very important for group-level processing, for example to realize Chinese information processing at the phrase level.

1.3 Different Kinds of MMC

There are many types of MMC, such as linked MMC, embedded MMC, combined MMC, inter mark MMC, and so on.

In object processing, at least one of the following multilevel mark codes is used:

Linked MMC is related to an object in the data source by the object's location.

Embedded MMC is embedded together with its related object, and the code range is coded inside it.

Combined MMC combines object codes according to the MMC coding method.

Inter mark MMC is located in the object code string, and the object relation information is coded inside it.

For linked MMC, if there is only one object in the data source, the MMC is related to the location of the data source; if there is more than one object, the linked MMC is related to the position of the object in the data source. Here, “one object” means that the object is related to only one MMC; the object itself can be any object, including a combination of multiple objects. A linked MMC can relate to an outside object or an inner object, where inner and outside refer to inside and outside an object sequence.

Linked MMC can be nested. For example, the MMC for a phrase of more than 1 character is related to the character codes of the phrase, and each character code is related to a font in the font library.

Boundary or range information is included in embedded MMC; the boundary information can be the size, length, or range of the object in the object sequence, and so on.

The objects to be embedded can be various objects, including object codes.

Embedded MMC can be used to distinguish object information, for example text information, or object method information such as a media player.

An embedded MMC can relate to a file, and a code sequence with embedded MMC can relate to a file. To make the code string scannable in both directions, two embedded MMC can be put in the embedded object block. An embedded MMC can be represented by two MMC, and embedded MMC can be nested.

Embedded MMC can be transformed into linked MMC, and vice versa.

Inter mark MMC marks an object sequence from among the objects. Examples are marks in text, marks in a markup language such as XML, and marks in formatted text such as RTF.

As an example, the marks in markup languages and formatted text are a kind of mark here called letter grouped marks, such as <head>, <body>, <table width=#or%>, and so on. There are two ways to set up inter mark MMC: one is to create a set of new marking symbols and their related codes; the other is to stay consistent with the marks now in use. For the latter, MMC can be used to represent the letter grouped marks, such as <head> and <body>. A relation table is then set up between letter grouped marks and MMC; the MMC can be input just like characters and displayed as the related letter grouped marks. Different input tables can be set up for different input methods.

The marks of various formatted texts and of various programming languages can be represented by a similar method.

Combined multilevel mark code consists of multiple object codes grouped according to the MMC coding method. First determine the object codes to be grouped, the segments of each code, and the mark position of each segment; then mark the mark positions by multilevel mark.

As an example, in the GB2312 code system each character code is 2 bytes, and the mark bits are the first bit of each byte, with value 1. The mark position can be the mark bit of either byte of the character. If, for example, the first byte is selected, then the mark bits of the combined multilevel mark code of a 2-character phrase are 11 01, and of a 3-character phrase 11 11 01.
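The GB2312 example can be sketched as follows, assuming that the first bit of the first byte of each character is the mark position and that Right 0 marking is applied across the characters of a phrase. The function names are hypothetical, and the byte values are merely sample codes with first bits 1, 1.

```python
MARK_BIT = 0x80  # assumption: mark position is the first bit of the first byte


def combine_phrase(characters):
    """Mark a phrase of 2-byte codes as one combined MMC (Right 0)."""
    out = bytearray()
    last = len(characters) - 1
    for i, (first, second) in enumerate(characters):
        first = (first | MARK_BIT) if i < last else (first & 0x7F)
        out += bytes([first, second])
    return bytes(out)


def mark_bits(code):
    """Show the first bit of every byte, grouped per character."""
    pairs = [code[i:i + 2] for i in range(0, len(code), 2)]
    return " ".join(f"{a >> 7}{b >> 7}" for a, b in pairs)


def split_phrases(sequence):
    """Recover the phrases and the original 2-byte codes from a sequence."""
    if len(sequence) % 2:
        raise ValueError("expected a whole number of 2-byte codes")
    phrases, current = [], []
    for i in range(0, len(sequence), 2):
        first, second = sequence[i], sequence[i + 1]
        current.append(bytes([first | MARK_BIT, second]))  # restore mark bit
        if not (first & MARK_BIT):  # mark bit 0 closes the phrase
            phrases.append(current)
            current = []
    if current:
        raise ValueError("sequence ends inside an unfinished phrase")
    return phrases


a, b, c = b"\xd6\xd0", b"\xb9\xfa", b"\xc8\xcb"
two = combine_phrase([a, b])
three = combine_phrase([a, b, c])
```

Because the second byte of every character keeps its original first bit, the original codes are recovered by setting the first bit of each first byte back to 1.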

For MMC of Chinese characters, the mark position can be the mark bit of each byte of the character; the mark bits of the combined multilevel mark code of a 2-character phrase are then 11 10, and of a 3-character phrase 11 11 10. The grouping of English words can be chosen in different ways.

Because a combined MMC stands for a group of original codes, and because the number of phrases is unlimited compared with the limited number of characters in a language, the resource of combined multilevel mark code is also unlimited.

Two coding methods of combined multilevel mark code for Chinese characters are as follows.

If the MMC of a Chinese character is 2 bytes long, take the group representative of each character of a phrase as the mark position, and mark the mark positions by the same kind of multilevel mark as the MMC of the characters.

If the code of each Chinese character in a multiple-character phrase is 2 bytes long with a mark bit in one byte of the two, as in GB2312, Big5, and GBK, take one mark bit of each character as the mark bit, and mark the mark bits by multilevel mark.

Similarly, object groups can be represented by multilevel marks. The mark bits of the MMC of an object group and the multilevel mark of the object group are transformed into each other as follows. The mark bits of the MMC are selected and stored in the multilevel mark storage space, which transforms them into multilevel mark; the related mark bits in the related object group are replaced by the multilevel mark, which transforms the multilevel mark into combined MMC.

Generally speaking, the code resource is sufficient if the phrases of one language are coded by 3-byte or 4-byte MMC (not combined MMC); it is recommended to give frequently used phrases shorter codes and rarely used phrases longer codes.

1.4 Sequential Objects

A sequence consists of N nodes, N>=1; each node is related to an object, called a node object, and a node object consists of M sub objects, M>=0. Node objects and sub objects can then be represented and processed by MMC.

A sub object can itself consist of sub objects.

For M=0, take a movie as an example. A movie consists of segments, such as scenes or frames; each segment can relate to a linked MMC, and a segment can include text information, possibly in multiple languages. The linked MMC related to the movie can be edited or modified just like text. The MMC sequence can be played by a player. This makes movie editing easier.

In the above example, MMC can relate to one movie or to multiple movies; a set of MMC can reflect the level relation of movies and their segments. For example, at the first level an MMC points to a movie; at the second level, to a segment; at the third level, to a lower segment; and so on.

For M>0, take as an example a node that relates to a scene of an active object and to an MMC; the sub objects of the node object relate to a group of MMC, such as the actors in a cartoon and the parts of an actor. Assume an actor consists of two parts, the body and the arms, and that the body and one arm stay unchanged while the actor moves. Then the body and that arm can be represented by one MMC, and the changing arm can be represented by multiple MMC, each related to one form of the arm. Thus, if N>1, the related MMC can represent the moving actor; if N=1, a static actor.

Digital man, music, and song can be represented by above method.

Linked MMC can act as an outer mark of objects. Linked MMC points to an object in the data source, such as a column or row in a table, or to a function, formula, paper, dictionary, and so on.

Linked MMC is similar to a link in OLE, and to the relation of a character code to the font library. MMC can have layers, and can be edited and modified as text.

Linked MMC can be divided into two classes: the first links to objects outside an object sequence, and the second to objects inside it. A first-class linked MMC points to an object in a data source outside the object sequence; a second-class linked MMC points to an object inside the object sequence.

As an example of embedded MMC, a movie consists of segments, and at least one embedded MMC is related to each segment, located in front of or after the segment. The embedded MMC includes information about the segment, such as its length, its text information, its icon, and its play program. The embedded MMC sequence can be scanned forward if the embedded MMC is in front of the segment, and backward if it is after the segment. The sequence can be scanned in both directions if there are two MMC for each segment, one in front of and one after it.

The embedded MMC for a movie can be edited or modified as text, and played by a player. The embedded MMC for multiple movies can likewise be edited or modified as text; the class part of the MMC can be used to distinguish different movies. Embedded MMC is similar to embedding in OLE, but more flexible and more convenient.

Binary information, such as communication information, can be marked not only by multilevel marks but also by MMC; the MMC can include error information, and so can be used to correct errors made in transmission.

A binary sequence can be grouped into segments according to some rule; the segments can then be marked by MMC, and the MMC includes some information about the related segments.

1.5 Data Compression by MMC

MMC can be used to compress data by one or more of the following operations:

(1) when compressing repeated objects in an object sequence, representing the number of repetitions by MMC;

(2) representing objects of high frequency by shorter MMC;

(3) representing objects of low value by shorter MMC;

(4) setting up an object group library outside the object sequence, and representing the related object group of the library in the sequence by MMC;

(5) setting up an object group library inside the object sequence, and representing the related object group of the library in the sequence by MMC.

MMC can be used for various kinds of data compression, related to objects outside or inside the sequence.

In compression, the length of an MMC can be selected according to the data value, and MMC can be used as a pointer to link the compressed information. The length of an MMC can also be selected according to frequency of appearance, to decrease storage space, and MMC can be used to represent the compressed information. Proper segments should be selected for the MMC used in compression.

It is very important to select a proper length for MMC: relatively fewer mark bits are needed by multilevel marks if the length is long, and relatively more if the length is short. An MMC can have coding segments of different lengths.

To solve the problem of too many mark bits in MMC, an MMC can be used as an inner mark in which the coding segment length and the number of segments are included, for example a coding segment of 2 bits and 10 segments.

Taking Chinese text as an example, the 127 Chinese characters with the highest frequency account for 44.9683% of text. If these characters are represented by 1-byte MMC, then the compression rate = 45/200 = 22.5%: one Chinese character takes 2 bytes, so 100 Chinese characters take 200 bytes; about 45 of the 100 characters become 1-byte MMC, each saving 1 byte.
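The arithmetic of this example can be reproduced in a few lines. The sketch rounds the 44.9683% share to 45 characters per 100, as the text does, and assumes that the 1-byte MMC and the 2-byte codes remain distinguishable without extra bytes; the function name is hypothetical.

```python
def compressed_size(n_chars, short_share, short_len=1, full_len=2):
    """Bytes needed when a share of the characters uses the short code."""
    short = round(n_chars * short_share)
    return short * short_len + (n_chars - short) * full_len


original = 100 * 2                       # 100 characters, 2 bytes each
compressed = compressed_size(100, 0.45)  # 45 one-byte MMC, 55 two-byte codes
saving = (original - compressed) / original
```

Here `compressed` is 155 bytes, so the saving is 45 bytes out of 200, the 22.5% given in the text.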

The compression rate will be higher if high-frequency phrases of multiple characters are also represented by MMC.

In text compression, phrases with higher frequency can be represented by short MMC, and phrases with lower frequency by longer MMC.

For 7-bit coded letters, such as English letters, the spaces between words can be compressed. For example, the first bits of the letters of each word can be marked by Right 0 multilevel marks, and the spaces can then be omitted.

There can be more than one object group library outside the sequence.

MMC can be used to realize compression similar to RLE (run length encoding). This kind of compression can be called repeated object counting compression; MMC is used to represent the repeated character and the number of repetitions.

MMC can be used to realize compression similar to LZ77. Information that appears again later can be represented by MMC.

When MMC is used to realize outside dictionary compression, an outside dictionary is set up and each MMC relates to an item of the dictionary; in the compressed text, the text matching items of the dictionary is represented by the related MMC.

When MMC is used to realize inner dictionary compression, the length of the MMC can vary, and the MMC can include information related to the compressed text, such as its location and length. A related dictionary can therefore be set up whose items can differ in length and whose number of items is not limited. This is better than the LZW method.

The compression methods can be mixed.

For data compression such as the compression of image and voice, the data can be divided into frames, which can be represented by embedded MMC; the data inside a frame can be represented by MMC, whether the data is in the time domain or the frequency domain. Repeated frames in a frame sequence can be represented by MMC, as can frames that appear again later. A later frame can be represented by its difference (delta) from an earlier frame.

Objects appearing sequentially in a space can be represented by MMC; for example, if object B appears after object A, then object B can be represented by the difference between A and B.

Data structure and data can be compressed and stored separately, using different compression methods.

In various programming languages, the data length is fixed for a specific data type, however large or small the value; when data is represented by MMC, however, the data length can be flexible.

If the length of the integer type can vary, storage space can be saved. For example, taking 1 bit of each byte as the mark bit leaves 7 bits of each byte to represent data; if 1 bit is used as a sign to represent positive or negative data, then 6 bits of a 1-byte integer can represent the size of the data. Likewise, 16−2−1=13 bits can represent the data size for a 2-byte integer, 24−3−1=20 bits for a 3-byte integer, and so on. Data and pointers can be represented by MMC, called MMC data and MMC pointers respectively; the length of the MMC depends on the value range of the objects being represented.
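A variable-length signed integer of this kind can be sketched as follows. The layout is an assumption made for illustration: bytes are marked by Right 0 MM, the sign occupies the highest value bit of the first byte, and the magnitude fills the remaining 7n−1 bits of an n-byte code. The function names are hypothetical.

```python
def encode_int(value):
    """Encode a signed integer in the fewest Right 0 marked bytes."""
    magnitude = abs(value)
    n = 1
    while magnitude >= 1 << (7 * n - 1):  # 6, 13, 20, ... magnitude bits
        n += 1
    sign = 1 if value < 0 else 0
    payload = (sign << (7 * n - 1)) | magnitude
    out = bytearray()
    for i in range(n):
        shift = 7 * (n - 1 - i)
        out.append(0x80 | ((payload >> shift) & 0x7F))  # mark bit 1
    out[-1] &= 0x7F  # last byte: mark bit 0
    return bytes(out)


def decode_int(code):
    """Decode one integer produced by encode_int."""
    n = len(code)
    payload = 0
    for byte in code:
        payload = (payload << 7) | (byte & 0x7F)
    sign = payload >> (7 * n - 1)
    magnitude = payload & ((1 << (7 * n - 1)) - 1)
    return -magnitude if sign else magnitude


small = encode_int(63)     # fits the 6 magnitude bits of a 1-byte code
medium = encode_int(8191)  # fits the 13 magnitude bits of a 2-byte code
large = encode_int(8192)   # needs a 3-byte code
```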

The data coding steps are as follows:

According to the size of the data, whether a sign bit is needed, and the representation form of the data, select a proper code segment and number of segments, select the mark bit, and then encode the data or pointer by marking the mark bits by multilevel mark.

Objects with higher frequency can be represented by shorter MMC.

The coding steps are as follows:

(1) Selecting a proper code segment with N bits;

(2) Coding the objects with high frequency by 1 code segment, with N−1 bits for the value and the mark bit marked by multilevel mark;

(3) Selecting 2 code segments, taking (2N−2) bits of the 2 segments to code the objects of next highest frequency, with the mark bits marked by multilevel mark;

(4) Selecting more segments if necessary, and repeating the coding process until all objects are coded.
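Steps (1) to (4) can be sketched as below. The sketch assumes that the objects are already sorted from most to least frequent, that the mark bit is the first bit of each N-bit segment, and that Right 0 marking is used; the function name is hypothetical.

```python
def assign_codes(objects_by_frequency, segment_bits=8):
    """Give shorter codes to more frequent objects.

    Each code is a tuple of segments; the first bit of every segment is
    the mark bit (1, ..., 1, 0) and the other bits hold the value.
    """
    value_bits = segment_bits - 1
    value_mask = (1 << value_bits) - 1
    codes = {}
    n_segments = 1
    first_rank = 0  # rank of the first object coded with n_segments
    capacity = 1 << (value_bits * n_segments)
    for rank, obj in enumerate(objects_by_frequency):
        while rank - first_rank >= capacity:  # this length is used up
            first_rank += capacity
            n_segments += 1
            capacity = 1 << (value_bits * n_segments)
        index = rank - first_rank
        segments = []
        for i in range(n_segments):
            shift = value_bits * (n_segments - 1 - i)
            mark = 0 if i == n_segments - 1 else 1
            segments.append((mark << value_bits)
                            | ((index >> shift) & value_mask))
        codes[obj] = tuple(segments)
    return codes


# With byte segments, the 128 most frequent objects get 1-byte codes
# and the next 2**14 objects get 2-byte codes.
codes = assign_codes(range(130))
```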

2. Multiple Property Code of Objects

As everyone knows, a Chinese character has multiple properties, such as font, pronunciation, and meaning. However, a Chinese character code is related only to its grapheme, regardless of pronunciation or meaning. Many Chinese characters have multiple pronunciations. For a polyphone character it is hard to determine the correct pronunciation, because the code of the character is related to multiple pronunciations. A similar situation arises for character meaning, and for other objects. For example, a video can have multiple properties, such as voice, text, and images. A movie can relate to multiple languages; a drama may relate to different music or different performances; a book can relate to different translations; an operating system relates to different language versions; and so on.

An object code is often encoded by one of the object's properties, and one property may relate to multiple characteristics (“characteristic” means the same as “property”; the two words are used here only to distinguish different levels). An object with multiple properties is called a multiple property object. Generally speaking, it is hard to distinguish the correct property of a multiple property object. This is the reason for introducing the multiple property code. Doing so is impossible without enough code resources. However, it is easy with the multilevel mark code method, because the MMC method provides unlimited code resources.

Because it is hard to distinguish the correct property of a multiple property object, the object can be encoded by multiple codes, each relating to one property; this kind of code is called a multiple property code.

For a multiple property object, when the object code does not relate directly to one property, a multiple property code is added to relate to that property; a multiple property object in an object group is then encoded by the multiple property code that corresponds to the correct property.

From the above method, a multiple property code method for Chinese polyphone characters can be derived:

For a Chinese polyphone character with N pronunciations, add (N−1) multi-property codes to the character, so that there are N codes for the character, each relating to one pronunciation; a polyphone character in a phrase is represented by the multiple property code with the correct pronunciation.

A method for solving the problem of Chinese polyphones is presented in Chinese patent CN 1182234A (ZL 96115997.9). To resolve polyphones in phrases, that method uses a phrase phoneme library. The new method resolves polyphones in phrases not by a phrase phoneme library but by the multiple property codes of the polyphone characters. The new method can save space and operation time. There are about 830 polyphone characters in GB2312, which includes 6763 characters. Most of the polyphone characters are frequently used characters. There are 5 polyphone characters among the ten highest-frequency characters, with frequency 4%; they are

Phrases with multiple characters may have different pronunciations, for example: (ha shi), (haóshi), etc.

There are about 1300 phonemes in Chinese character pronunciations; one phoneme is related to about 5 characters in GB2312.

Next is the method of distinguishing the pronunciations of a polyphone character by multi-property codes.

The method comprises at least one of the steps of:

Distinguishing a Chinese polyphone character with N different pronunciations by adding (N−1) multi-property codes; the said multi-property code of the polyphone character has the following mark bits: mark bits 1, 1 if the original code of the polyphone character belongs to GB2312; mark bits 1, 0 if the original code of the polyphone character belongs to GBK but not to GB2312, i.e. the original mark bits are 1, 0.
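The idea of giving a polyphone character N codes, one per pronunciation, can be sketched in Python as below. The additional code value `0x1D0D0`, the pronunciation labels and the function name are hypothetical; the sketch does not reproduce the mark-bit layout described above and only shows how a phrase can store the code tied to the correct pronunciation.

```python
def assign_multi_property_codes(original_code, pronunciations, free_codes):
    """Return {pronunciation: code} for a character with N pronunciations.

    The first pronunciation keeps the original code; each of the other
    N-1 pronunciations takes one additional (multi-property) code.
    """
    free = iter(free_codes)
    codes = {pronunciations[0]: original_code}
    for pronunciation in pronunciations[1:]:
        codes[pronunciation] = next(free)
    return codes


def gb2312_code(ch):
    """Numeric GB2312 code of a single character."""
    return int.from_bytes(ch.encode("gb2312"), "big")


# Illustrative character with two pronunciations (N = 2, so one added code).
XING = gb2312_code("行")
CODES = assign_multi_property_codes(XING, ["xing2", "hang2"], [0x1D0D0])

# A phrase stores the code tied to the correct pronunciation of the polyphone.
BANK = [gb2312_code("银"), CODES["hang2"]]
```

Because the phrase `BANK` holds the added code rather than the original one, its pronunciation can be read off directly without consulting a phrase phoneme library.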

The advantage of the above method is that it makes the multi-property codes of Chinese polyphone characters consistent with the Chinese character codes now in use, so the code length of MMC is short for frequently used characters and longer for rarely used ones. In reference CN 1182234A (ZL 96115997.9), only the different mark bits were pointed out; the relation between code length and frequency was not considered.

A polyphone character can be represented by MMC, and a polyphone phrase can also be represented by MMC.

The method above makes the coding of Chinese character codes, polyphone character codes and MMC consistent.

Chinese full pronunciation is a kind of marking of Chinese character pronunciations.

Chinese full pronunciation consists of two bytes, holding two letters of 5 bits each and one of 5 pronunciation notes (tones) in 3 bits.

The phoneme information of a Chinese character code, without note information, can be represented by two letters. The two-letter codes can be sequenced according to the pinyin sequence of the Chinese character codes; for example, pinyin “a” is represented by the two letters “aa”, pinyin “ai” by “ab”, pinyin “an” by “ac”, and so on. The relation between pinyin and its two letters is called the pinyin-two letter mapping table. The full pronunciation above is called the sequenced full pronunciation. The sequenced full pronunciation of Chinese character codes is one form of full pronunciation.

Chinese sequenced full pronunciation consists of two bytes, holding two letters of 5 bits each and one of 5 pronunciation notes in 3 bits; the ordering of the two letters is selected according to the sequence of Chinese syllables represented by Pinyin (or phonetic notation); the value of the 3 bits is selected according to the note sequence in the standard dictionary.
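A two-byte sequenced full pronunciation can be packed as sketched below. The bit layout (first letter in the highest 5 bits, then the second letter, then the 3-bit note) and the function names are assumptions chosen so that integer comparison follows the letter and note order; only the three mapping entries quoted in the text are included.

```python
# Excerpt of the pinyin-two letter mapping table quoted in the text.
PINYIN_TO_TWO_LETTERS = {"a": "aa", "ai": "ab", "an": "ac"}
TWO_LETTERS_TO_PINYIN = {v: k for k, v in PINYIN_TO_TWO_LETTERS.items()}


def pack_sfp(pinyin: str, note: int) -> int:
    """Pack two 5-bit letters and a 3-bit note into a 13-bit value."""
    if not 1 <= note <= 5:
        raise ValueError("note must be in 1..5")
    first, second = PINYIN_TO_TWO_LETTERS[pinyin]
    a = ord(first) - ord("a")
    b = ord(second) - ord("a")
    return (a << 8) | (b << 3) | note


def unpack_sfp(value: int):
    """Recover (pinyin, note) from a packed value."""
    a = (value >> 8) & 0x1F
    b = (value >> 3) & 0x1F
    note = value & 0x07
    letters = chr(a + ord("a")) + chr(b + ord("a"))
    return TWO_LETTERS_TO_PINYIN[letters], note
```

With this layout the packed values of “a”, “ai” and “an” increase in the same order as the syllables, which is the property that later allows strings to be compared through their sequenced full pronunciations.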

The method above can also be used with Zhuyin, a notation for Chinese character pronunciation that is still used in Taiwan.

In Chinese patent CN 1182234A (ZL 96115997.9), the concept of full pronunciation is presented, but not the concept of sequenced full pronunciation. The difference is that sequenced full pronunciation can be used to compare Chinese character strings, whereas full pronunciation cannot.

According to the sequence of a Chinese character coding system, set up a sequenced full pronunciation table, each item of which is the sequenced full pronunciation of the corresponding character in the coding system. Because the sequence of the sequenced full pronunciations in the table is consistent with the sequence of the character codes in the coding system, it is easy to obtain the sequenced full pronunciation of a given character from the table. In other words, a Chinese character can be transformed into its sequenced full pronunciation by the table. Different phoneme representations of Chinese characters can be transformed into each other; a sequenced full pronunciation can be transformed into Pinyin, Shuangpin, or Zhuyin.

The sequenced full pronunciation table is the table of sequenced full pronunciations of Chinese character codes, in which the sequence of the sequenced full pronunciations is consistent with the Chinese character code system.

Set up a table called the pinyin-two letter mapping table to map each two-letter code to its pinyin; then, by the sequenced full pronunciation table, a sequenced full pronunciation can be transformed into its pinyin letters and its note.

Various Chinese phoneme representations, such as Pinyin, Shuangpin, Zhuyin, or sequenced full pronunciation, can be transformed into each other. According to the relations among Pinyin, Shuangpin and Zhuyin, set up relation tables to realize the transformation.

According to the relation between the Chinese character codes and the sequenced full pronunciations, the phonemes can be obtained from the Chinese character codes by the following steps:

(1) According to the sequenced full pronunciation table, generating the sequenced full pronunciation of the related Chinese character code;

(2) According to the relation table between the two letters and the related pinyin (or related Zhuyin), generating the pinyin (or Zhuyin) and its note.
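The two steps can be sketched in Python for GB2312, whose level-1 characters start at 0xB0A1 with 94 characters per row. The table below is a three-entry excerpt with illustrative readings, and the names `SFP_TABLE`, `gb2312_index` and `pinyin_of` are assumptions; a real table would cover the whole code system.

```python
# Excerpt of a sequenced full pronunciation table, stored in GB2312 code order.
# Each item is (two letters, note); entries correspond to 啊, 阿, 埃.
SFP_TABLE = [("aa", 1), ("aa", 1), ("ab", 1)]

# Excerpt of the pinyin-two letter mapping table, inverted for lookup.
TWO_LETTERS_TO_PINYIN = {"aa": "a", "ab": "ai", "ac": "an"}


def gb2312_index(ch: str) -> int:
    """Position of a character in GB2312 code order, counted from 0xB0A1."""
    hi, lo = ch.encode("gb2312")
    return (hi - 0xB0) * 94 + (lo - 0xA1)


def pinyin_of(ch: str):
    """Step (1): table lookup by code position; step (2): two letters to pinyin."""
    letters, note = SFP_TABLE[gb2312_index(ch)]
    return TWO_LETTERS_TO_PINYIN[letters], note
```

Because the table is stored in code order, it needs no character keys: the code itself gives the position, which is why the table can contain only the pronunciations.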

The following apparatus can be used to transform Chinese codes into speech:

(1) voice library apparatus: used to store the voice of the related sequenced full pronunciation;

(2) index apparatus: used to store the index of the voice library apparatus for the related sequenced full pronunciation;

(3) transformation apparatus: according to the index in the index apparatus and the sequenced full pronunciation, calculating the voice location of the sequenced full pronunciation in the voice library.

According to the relation table between Chinese character codes and their sequenced full pronunciations, the voice of a Chinese character is output by the following steps:

(1) Generating the sequenced full pronunciation of the character according to the relation table between Chinese character codes and their sequenced full pronunciations;

(2) Retrieving the location of the voice related to the sequenced full pronunciation in the voice library by the index apparatus of the sequenced full pronunciation;

(3) Retrieving the voice in the voice library by the location.

3. Coded Input Object Processing Methods

In object processing, such as input, output, transformation, transfer, etc., the most important question is how to represent the objects. For example, the objects can be represented by MMC or by multiple property codes. Object processing is closely related to object representation.

In an MMC sequence, the class of an MMC can be distinguished by its class part. To display text, the font is retrieved from the font library related to the class of the character and displayed on the screen. For a nested MMC, such as the MMC of a multiple character phrase, first retrieve the codes of the characters by the MMC, then retrieve the fonts of the character codes from the related font library, and display the fonts on the screen. In text to speech, first distinguish the code class according to the class segment, then retrieve the phonemes of the related characters, and then retrieve the voice data related to the phonemes from the voice library. In object inputting, according to the class of the objects and the input method, retrieve the MMC of the object, the multiple property codes, or the multilevel mark.

Object inputting can involve various objects; text is the most common object, and other objects can be images, video, music, etc.

The input codes of objects are codes related to the objects and used for inputting them. Object codes are used for object processing. To input an object, input the input codes of the object, and then transform the input codes into the object code in the machine. In order to distinguish input codes from object codes, input codes are also called outer codes, and object codes are also called machine codes. In this invention, the input codes of objects can be used not only in inputting, but also in object searching, outputting and other object processing.

The methods of object inputting can be divided into two classes: no-coded input methods and coded input methods. For a given kind of keyboard, if the number of characters in the code system is less than the number of keys for inputting characters, then pressing one key can input a character; this is called the no-coded input method, because no combination of several keys is needed to input a character. Inputting English on a computer keyboard belongs to this kind of method. If the number of characters in the code system is more than the number of keys, then pressing one key cannot input a character; in this case, several keys must be coded to input a character, and an input code table is set up, which contains the relation between object codes and input codes. This kind of method is called the coded input method. For example, inputting Chinese characters on a computer keyboard belongs to this kind of method. Another example is inputting English on a telephone or hand phone: because the number of keys is less than the number of letters, a coded input method is needed. For a Chinese inputting keyboard, a special keyboard designed for inputting Chinese characters, pressing only one key can input a Chinese character, so the method belongs to the no-coded input method. Therefore, which kind of input method should be used is determined by the number of keys of the keyboard and the number of characters in the code system.

If the number of objects of one kind is more than the number of keys of a kind of keyboard, an object cannot be inputted by pressing only one key, so object input codes coded by more than one key are used to input the objects; this kind of input method is called the coded input method. Its characteristic is that, in object inputting, one object can relate to more than one key on the keyboard.

If the number of objects of one kind is not more than the number of keys of a kind of keyboard, an object can be inputted by pressing a single key; this kind of input method is called the no-coded input method. Its characteristic is that, in object inputting, one object relates to only one key on the keyboard.

An object inputted by a coded input method is called a coded input object, and an object inputted by a no-coded input method is called a no-coded input object.

Objects can also be inputted by image recognition or voice recognition, such as pen input, optical character reader (OCR), or speech recognition. Inputting by image recognition or voice recognition first extracts the characteristics of the objects, and then inputs the objects according to those characteristics. Here, an input code table is needed in the object inputting, and the input codes relate to the characteristics of the objects. Therefore, these are also coded input methods.

An input method by image recognition or voice recognition can also be a no-coded input method. For example, in speech recognition, if the phonemes are selected so that each phoneme relates to only one object, that is, one phoneme can be used to input one object, the input method is a no-coded input method. If more than one phoneme is used to input an object, the input method is a coded input method. For example, characters whose pronunciation consists of phonemes can be inputted by input codes, each of which consists of phonemes.

A large number of objects represented by non-MMC, MMC or multiple property codes are coded input objects. For example, embedded MMC can be inputted or searched by the text in the related objects; in this case, the related text can serve as the input code of the objects.

Input codes of objects can be coded by one of the properties of the objects; for example, pinyin inputting of Chinese characters is based on the phoneme property of the characters, and stroke inputting of Chinese characters is based on the strokes of the characters. Input codes of objects may also be coded by multiple properties of the objects. Because of polyphone Chinese characters, Chinese text to speech, the Pinyin output of text, and text inputting and text searching by phoneme all face severe problems.

The processing method for coded input objects is as follows:

The input code table of coded input objects consists only of the relation between each single object and its input code; if the object code sequence is consistent with the input code sequence, the table consists only of the input codes of the objects, sequenced according to the object code system. Object codes can be non-multilevel mark codes or multilevel mark codes. The processing method for coded input objects comprises generating the input codes from the object codes, that is, generating the input codes for the data to be processed according to the object input code table; for a multiple property object, the confusion of several input codes for one object code is eliminated by multiple property codes.

The key step is “generating the input code of relative object codes for the related data according to the object input code table”; “the related data” can be object groups of an object group library or objects of a data source used for inputting; in object searching, “the related data” is the objects of the data source to be searched.

Input codes can be inputted by keyboard, or transformed from image or voice.

The method above can be used in object inputting, object searching by input codes, and object input code output.

Object inputting or object searching for coded input objects is as follows:

(1) inputting input codes for the objects to be inputted or to be searched;

(2) generating input codes of the object codes in the related data according to the object input code table;

(3) retrieving the inputted objects or searched objects by comparing the inputted input codes with the generated input codes.
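Steps (1) to (3) can be sketched as follows. The table entries, the sample data source and the prefix comparison are illustrative assumptions; the point is that the data's object codes are turned into input codes and compared with what the user typed, rather than the typed codes being converted to object codes first.

```python
# Hypothetical input code table: one entry per single object (character).
INPUT_CODE_TABLE = {
    "北": "bei", "京": "jing", "上": "shang", "海": "hai", "南": "nan",
}


def generate_input_code(text, table=INPUT_CODE_TABLE):
    """Step (2): build the input code of an object group from its objects."""
    return "".join(table.get(ch, ch) for ch in text)


def search_by_input_code(typed, data_source, table=INPUT_CODE_TABLE):
    """Step (3): keep the items whose generated input code starts with the typed code."""
    return [item for item in data_source
            if generate_input_code(item, table).startswith(typed)]


# "Related data": any data source can serve as the object group library.
CITIES = ["北京", "上海", "南京"]
```

Note that the table holds no entries for the phrases themselves; the input code of each phrase is generated on demand from the single-character entries.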

The output method of object input codes comprises the following step:

Generating the input codes of the object codes in the related data according to the object input code table by scanning the related data.

In the object processing above, multiple input code tables can be used, each relating to one input method, and one or multiple data sources can be used; in the comparison, if too many objects are matched, input codes of another input method are inputted, and the objects to be inputted or searched are then selected from the objects or object groups matched by the multiple input code tables.

Above, “The input code table of coded input objects only consists of the relation between each singular object and its input code” means that there are no object group input codes in the object input code table. Because the input codes of an object group consist of the input codes of the objects comprising the object group, an object group table can be used in object processing, but no input codes are needed in the object group table.

The said “related data” can be object groups of an object group library or objects of a data source used for inputting; in object searching, “the related data” is the data source to be searched. However, the said “related data” does not include the input code table. The object group table can be the object group table provided by the input method, or the object groups in the related data sources, such as databases, text, web pages, and the text included in image, voice and video files.

In order to process information conveniently, the items of the input code table can be of equal length. If they are not, they can be transformed to equal length by a method similar to full pronunciation or sequenced full pronunciation.

If the items of the input code table are of equal length, and if the sequence of the input code table is consistent with the sequence of the related object code system, then the input code table need contain only the input codes of the related objects.

The combined MMC of a multiple character phrase or the MMC of a multiple character phrase can be used in object searching.

English inputting on a phone or hand phone can use the input method for coded input objects.

The methods above can be used in Chinese inputting, searching and output.

When the objects are Chinese characters, the input code table is the sequenced full pronunciation table, sequenced according to the Chinese character code sequence. The sequenced full pronunciation of a Chinese character code consists of two bytes, holding two letters of 5 bits each and one of 5 pronunciation notes in 3 bits; the ordering of the two letters is selected according to the sequence of Chinese syllables represented by Pinyin (or phonetic notation); the value of the 3 bits is selected according to the note sequence in the standard dictionary.

Two Chinese character strings can be compared by comparing their sequenced full pronunciations.

The comparison of two Chinese strings proceeds as follows:

A. Retrieving the sequenced full pronunciations of both strings;

B. Comparing the sequenced full pronunciations of both strings.

The result of this comparison is the comparison result of the strings. Chinese inputting and searching can use phoneme codes, such as pinyin, zhuyin, or shuangpin; phoneme codes can be inputted by keyboard, or transformed from speech. The phonemes of Chinese information can be output by sequenced full pronunciations.
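Steps A and B can be sketched as follows. The three-character table and the function names are illustrative assumptions; each entry pairs the two letters of the sequenced full pronunciation with its note, so ordinary tuple comparison follows pronunciation order.

```python
# Hypothetical excerpt: character -> (two letters, note).
SFP = {"啊": ("aa", 1), "爱": ("ab", 4), "安": ("ac", 1)}


def sfp_string(text):
    """Step A: retrieve the sequenced full pronunciations of a string."""
    return [SFP[ch] for ch in text]


def compare_strings(s1, s2):
    """Step B: return -1, 0 or 1 by comparing the pronunciation sequences."""
    k1, k2 = sfp_string(s1), sfp_string(s2)
    return (k1 > k2) - (k1 < k2)
```

The same key function can be handed to a sort, for example `sorted(names, key=sfp_string)`, to order strings by pronunciation.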

The advantages of the methods above are as follows:

Taking book searching by author as an example, input the input code of the author to be searched, such as a pinyin string, and transform the pinyin string into sequenced full pronunciation; at the same time, transform the author strings in the author field into sequenced full pronunciation according to the sequenced full pronunciation input code table. The comparison can then be done on the sequenced full pronunciation strings, and the user can select the author from the comparison result. Similarly, the method above can be used in inputting. Because any data source can be used as an object group library, the author field of a database can be used as a phrase library in inputting, so it is easy to input an author name from the library. With the method above, searching and inputting are fast.

Because an object group can be represented by MMC, the object processing is at group level. If the object is text, the text processing is at phrase level. Taking Chinese as an example, processing is at character level without MMC; with MMC, it is raised to phrase level. Therefore, the searching is more precise.

The processing is also more precise with multiple property codes; for example, if polyphone characters are represented by multiple property codes, it is convenient to search Chinese information by phonemes.

For no-coded input objects, the object codes consist of letters, so objects can be searched by inputting letters.

However, for coded input objects, the object searching used before this invention is divided into two steps: first, transform the input codes into object codes; second, compare the transformed object codes with the object codes in the object data source.

For coded input object searching by the method proposed here, it is not necessary to transform input codes into object codes. The object is searched directly by the input codes of the objects to be searched: the object codes in the data source are transformed into input codes, and the inputted input codes are then compared with the transformed input codes.

In order to realize input code searching, an input code table is needed, and one object code should relate to only one input code string; if it relates to more than one input code string, multiple property codes are used to make the input code string unique. Multiple language text can be searched by the input code method.

It is easy to realize multiple language searching by input code searching. Input in one language, translate the input into another language, search in that language, and then translate the results back into the original language. Two languages can be searched through a third language. For example, in order to search French by Chinese, first translate the Chinese into English, then into French; then translate the search result into English, and then into Chinese.

The input code searching method can be used across different code systems. For example, in the Chinese code systems GB2312 and Big5, if the searching is by character codes, the input codes must first be transformed into the character codes of each code system. However, with phoneme input code searching, because the phonemes of a character are the same in different code systems, it is only necessary to transform the character codes into input codes by the respective input tables.

Input code searching is faster, for the following reasons: first, there is no need to transform input codes into object codes during inputting; second, the searching can be done while the input codes are being inputted; third, the searching phrase library is an extended library, meaning that any phrases of any data sources can be phrases of the phrase library; although the library may be larger, it is closely relevant to the searching. Speech searching can be divided into two steps: first, transform the voice information into input codes; second, search the objects by the input codes.

In Chinese speech searching, first transform the speech into phonemes, then search the text by the phonemes.

Coded input object inputting is further explained as follows:

Coded input objects can be various objects, such as characters, images, voice, etc.

For example, a paragraph of text can be inputted by some characters; inter mark MMC and embedded MMC can be inputted by some characters; the same holds for an object group and the combined MMC of an object group. For an image library, the images can be inputted by text. This kind of input method can be divided into two steps: first, if the related characters are coded input objects, input the characters by input codes; otherwise, input the characters directly; second, consult an input code table, which consists of the relations between the text for inputting and the MMC related to the coded input objects. If input codes are designed directly for this kind of object, object inputting can be done in one step.

The input code table above is a relation table between single objects and their input codes. Object groups can be set up beforehand or generated automatically during inputting, but the object groups contain no input codes; during inputting, according to the input code table, an object group can be inputted by the input codes of the objects that make up the object group. Because there are no input codes for object groups in the input code table, multiple object group tables can be used in inputting, and the object groups can be in different locations, in different forms, and in different data sources, such as databases and text files. This kind of input code table is therefore called an opened input code table.

In the input methods now used, one input method relates to only one input table; there is no single method that can realize two input methods by inputting two kinds of input codes in one input code table, or by two input code tables.

An input method that can input objects or object groups by multiple kinds of input codes is called a cascade input method. The cascade input method can speed up inputting.

In the above processing, multiple input tables can be used, and one or multiple data sources can be included. When there are too many objects to select from, input codes of another input method are inputted, and the objects to be inputted or searched are then selected from the candidates. This is called the cascade input method.

A compound input code table can be used in the cascade input method; the compound input code table contains the relation between objects and the input codes of two or more input methods. The cascade input method can use a compound input table or two or more input code tables for different input methods.

If the input codes related to the first input method are of equal length in every item of a compound input code table, the second input codes can be put immediately after the first without a separating symbol.

The characteristics of the cascade input method are as follows:

Coded input objects can be inputted by the cascade input method, in which either a compound input code table is used, containing the relation between objects and the input codes of two or more input methods, or multiple input code tables are used, each relating to one input method. The cascade input method realizes object or object group inputting by inputting the input codes of two or more input methods. Cascade inputting comprises the steps of:

(1) inputting input codes of one class;

(2) comparing the inputted codes with the related class of input codes in the compound input code table, or with the input codes of the related single input code table;

(3) going to (5) if satisfactory results can be selected;

(4) if there are too many objects or object groups to select from, inputting input codes of another input method, then selecting the satisfactory results;

(5) retrieving the satisfactory results.
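Steps (1) to (5) can be sketched with two input code tables. The second table here uses stroke counts purely as a stand-in for a second input method; the table contents, the `max_candidates` threshold and the function name are illustrative assumptions.

```python
# First input method: phoneme codes (hypothetical excerpt).
FIRST_TABLE = {"是": "shi", "事": "shi", "时": "shi", "十": "shi", "我": "wo"}

# Second input method: stroke count, used only as an illustrative second code.
SECOND_TABLE = {"是": "9", "事": "8", "时": "7", "十": "2", "我": "7"}


def cascade_input(first_code, second_code=None, max_candidates=1):
    """Match on the first code; narrow with the second code only if needed."""
    # Steps (1)-(2): compare the first input code with the first table.
    candidates = [obj for obj, code in FIRST_TABLE.items() if code == first_code]
    # Step (3): few enough candidates, so the result can be selected directly.
    if len(candidates) <= max_candidates or second_code is None:
        return candidates
    # Step (4): too many candidates, so filter by the second input method.
    return [obj for obj in candidates if SECOND_TABLE[obj] == second_code]
```

For example, the first code “shi” alone leaves four candidates, while adding the second code “7” narrows the selection to a single character.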

The cascade input method can be used in Chinese inputting. For example, the first input method is a phoneme method, such as pinyin, zhuyin or shuangpin; the second input method is a shape-phoneme method, in which a Chinese character is divided into two parts according to its shape, and each part is represented by the phoneme of that part. This method can reduce redundant phrases or characters.

The cascade input method can be used in object searching by input codes.

If a multilevel mark is adopted in the inputting procedure, the multilevel mark information can be stored in the storage media.

4. Multilevel Mark (M Mark)

Marks can be used to describe an object or multiple objects, to describe the properties of an object, the relations between objects and the object structure, and to distinguish objects and object groups. A mark is also a kind of object. Marks can be classified according to different standards.

Marks can be classified into bit marks, byte marks, letter combined marks, code marks, etc., according to the symbols the marks use.

Marks can be classified into inner marks, inter marks, and outer marks according to where the marks are located.

Inner marks: marks located inside the object to be marked, such as the bit 1 mark in the first bit of the bytes of a Chinese character.

Outer marks: marks located outside the object to be marked.

Inter marks: marks located among the objects in an object sequence. Inner marks, inter marks, and outer marks can be transformed from one kind to another.

Definition of A, B mark: two different objects, object A and object B, used as marks are called A mark and B mark respectively.

Here “different objects” means that the two objects can be distinguished from each other. Two objects that complement each other are often used as A mark and B mark, such as true and false in logic; a set and its complement; a letter and not a letter; above a value and below the value; in a data set and not in the data set; empty and not empty; left and right; and so on. As another example, in a B⁺ tree, a node without data is empty; a node with data is not empty, no matter whether it holds 2 or 3 data items. This can also be used as A mark and B mark.

The two objects of the binary system, 0 and 1, can be used as A mark and B mark, which are then also called 0 mark and 1 mark respectively.

Definition of object grouping: the procedure of dividing the objects of an object sequence into object groups according to some rule is called object grouping. In an object sequence, object grouping of consecutive objects is called consecutive grouping; otherwise it is called non-consecutive grouping.

The important role of A and B marks is to mark an object sequence to realize object grouping, that is, to distinguish the object groups in an object sequence by A and B marks.

Object groups in an object sequence are marked with a multilevel mark.

The said multilevel mark is one of the following:

Right A multilevel mark: the rightmost object in an object group is marked with A mark; the other N objects in the object group are marked with B mark; the A mark is called the group representative.

Left A multilevel mark: the leftmost object in an object group is marked with A mark; the other N objects in the object group are marked with B mark; the A mark is called the group representative.

Right B multilevel mark: the rightmost object in an object group is marked with B mark; the other N objects in the object group are marked with A mark; the B mark is called the group representative.

Left B multilevel mark: the leftmost object in an object group is marked with B mark; the other N objects in the object group are marked with A mark; the B mark is called the group representative.

The said N is a positive integer or 0.

For non-consecutive grouping, if some of the consecutive objects do not belong to the object group, these objects are marked with neither A mark nor B mark.

There are only two directions for scanning an object sequence: left to right, or right to left. For convenience, other directions, such as top to bottom or bottom to top, can be related to the left-to-right and right-to-left directions.

Examples are as follows:

Right A mark example: A, BA, BBA, BBBA, BBBBA

Left A mark example: A, AB, ABB, ABBB, ABBBB

Right B mark example: B, AB, AAB, AAAB, AAAAB

Left B mark example: B, BA, BAA, BAAA, BAAAA

For the Right A mark, the A mark is the group representative; if an object sequence is marked as ABABBABBBBABBBA, then there is 1 object in the first group, 2 objects in the second group, 3 objects in the third group, 5 objects in the fourth group, and 4 objects in the last group. The group representative in a multilevel mark is the sign of the object group.
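The grouping in this example can be reproduced with a short Python sketch. The function name `group_sizes` and its parameters are assumptions for illustration; a left mark is handled by scanning from right to left, and a trailing run without a representative is treated as an error.

```python
def group_sizes(marks, side="right", representative="A"):
    """Return the sizes of the object groups described by a mark string.

    side="right": the representative closes each group (scan left to right).
    side="left":  the representative opens each group (scan right to left).
    """
    sequence = marks if side == "right" else marks[::-1]
    sizes = []
    count = 0
    for mark in sequence:
        count += 1
        if mark == representative:  # the representative ends the current group
            sizes.append(count)
            count = 0
    if count:
        raise ValueError("trailing marks without a group representative")
    return sizes if side == "right" else sizes[::-1]
```

Applied to the sequence above with the Right A mark, the sketch yields the group sizes 1, 2, 3, 5 and 4.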

The 0 and 1 marks can be two different combinations of binary 0 and 1. Refer to the example of non-consecutive marks in section 4.3, Inter Multilevel Mark.

The four classes of M marks can be transformed into each other. For example, complementing each bit of a right 0 M mark transforms it into a right 1 M mark; complementing each bit of a left 0 M mark transforms it into a left 1 M mark; exchanging the leftmost bit and the rightmost bit of a right M mark yields a left M mark. In practice, the properties possessed by a right M mark when scanning from left to right are similar to the properties possessed by a left M mark when scanning from right to left.

Object grouping can be done by any kind of M mark.

An M mark is called an inner M mark if the mark is inside the object, an inter M mark if the mark is among the objects in an object sequence, and an outer M mark if the mark is outside the object.

4.1 Outer M mark

An outer M mark lies outside the object sequence to be marked. It is characterized as follows: the marks cannot be confused with the objects to be marked, and the marking is simple. An outer M mark marking method is illustrated by an example of marking Chinese and English text bottom up. In multiple-language text, different text is in different code systems and usually has different code lengths, so it is difficult to distinguish them. In order to distinguish different codes, an outer M mark can be used. An example of the marking procedure is illustrated in FIG. 3. In the example, the Right 1 M mark is used. Each byte can be marked by 1 marking byte.

First, each byte is marked by the first bit of its marking byte: an English letter is a 1-byte code, marked 1; a Chinese character is a 2-byte code, marked 01.

Next, mark the words. The first-level words are marked by the second bit of each marking byte, and the second-level words by the third bit of each marking byte. A word can consist of 1 character, 2 characters, and so on. Assume the fourth bit of each marking byte marks sentences and the fifth marks paragraphs. Then 1 paragraph can form a tree.

The bottom line in FIG. 3 is the text to be marked.

The second line from the bottom is the mark for the text: an English letter is a 1-byte code, marked 1; a Chinese character is a 2-byte code, marked 01.

The third line from the bottom is the word marking. Because the group representative is 1, the Chinese phrase or word [...] is marked 0, 1 respectively, just above the 1 marks in the second line. The marking procedure is similar for [...] and [...]. “English” is a word, so it is marked 0000001; “is” is a word, so it is marked 01.

The fourth line from the bottom holds the marks for 3-character and 4-character words. [...] consists of [...] and [...]; the “0” mark relates to the “1” mark of [...], and the “1” mark relates to the “1” mark of [...]. [...] consists of [...] and [...]; the marks “0” and “1” relate to the mark “1” of [...] and the mark “1” of [...] respectively.

The fifth line from the bottom holds the marks for sentences. The first sentence consists of [...] and [...]. The marks “0001” relate to each word. The second sentence consists of “English”, “ ”, “is”, “ ”, [...] and “∘”; there are seven parts in this sentence, in which “ ” represents a space. Each of the marks “0000001” relates to one part.

The sixth line from the bottom holds the marks for the paragraph consisting of two sentences; each of the marks “01” relates to one sentence.

It is obvious that the marking procedure above forms a matrix, and the matrix can represent a tree structure.

FIG. 3 is a block diagram illustrating the marking procedure by right 1 multilevel mark for a string with Chinese and English text:

English is

and a matrix is generated by the marking procedure.

FIG. 4 is a block diagram illustrating the tree generating procedure by multilevel marking. The leaves of the tree relate to characters; the intersections of lines are inner nodes and relate to words or sentences. In order to relate to the Right matrix introduced later, the tree is drawn in a tilted form.

M marking can be done during text inputting to mark the inputted text.

Inner M mark, inter M mark and outer M mark can be transformed into each other.

M mark can be used in information transmission. For example, one channel carries the information to be transmitted, and another channel carries the outer marks of the transmitted information.

Outer M mark can be used in intersect object grouping, for example, the intersect grouping of [...] (meaning camera).

Each character can use 3 bit marks, and right 0 M mark is used.

Right 0 M mark: 110, grouping the 3 characters into one group:

Right 0 M mark: 100, grouping the 3 characters into two groups:

Right 0 M mark: 010, grouping the 3 characters into two groups:

Here the character [...] can be grouped into [...] or grouped into [...]; this can be called intersect grouping.

The object grouping of an object sequence can be represented by a matrix, some elements of which are M marks; the matrix can be used to represent a tree structure. The generation of the matrix comprises the following steps:

(1) Marking the objects in the lowest level into object groups by M mark, and generating one row of the matrix;

(2) Marking the objects in the next higher level into object groups by M mark, and generating the next row of the matrix above;

(3) Repeating step (2) until the marking is finished and the matrix is generated.

If binary 0, 1 M marks are used in the operations above, then a logic matrix is generated. If the group representatives in the matrix are replaced by the related data, then an entity matrix of the tree structure is generated.
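The three steps above can be sketched as follows. The sketch assumes the Right 1 M mark of FIG. 3 and assumes that each level is described simply by the sizes of its groups; the function name `build_logic_matrix` and the use of `None` for empty elements are illustrative choices, not taken from the specification.

```python
from typing import Optional


def build_logic_matrix(n_objects: int, levels: list[list[int]]) -> list[list[Optional[int]]]:
    """Build a logic matrix with Right 1 M marks.

    levels[0] holds the group sizes of the lowest level, levels[1] the sizes
    used to group the representatives of levels[0], and so on.
    The lowest level is returned as the bottom row.
    """
    columns = list(range(n_objects))      # columns that still take part in grouping
    rows = []
    for sizes in levels:
        if any(s < 1 for s in sizes) or sum(sizes) != len(columns):
            raise ValueError("group sizes must cover every object of the level below")
        row: list[Optional[int]] = [None] * n_objects
        representatives, pos = [], 0
        for size in sizes:
            group = columns[pos:pos + size]
            for col in group[:-1]:
                row[col] = 0              # ordinary member of the group
            row[group[-1]] = 1            # rightmost member: group representative
            representatives.append(group[-1])
            pos += size
        rows.append(row)
        columns = representatives         # the next level groups the representatives
    return rows[::-1]


# Five objects grouped as (2, 3); the two representatives then form one group.
for row in build_logic_matrix(5, [[2, 3], [2]]):
    print(row)
```

Replacing each 1 in such a matrix by the data of the group it represents would give the entity matrix mentioned above.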

The objects in an object sequence are grouped level by level; each grouping above the lowest level is a grouping of group representatives. The nodes of the tree structure represent the objects or object groups.

4.2 Inner Multilevel Mark (Inner M Mark)

Inner M mark can be used in the coding of multilevel mark code. Multilevel mark code is realized by multilevel mark. Object groups in an object sequence are grouped and distinguished by multilevel marks placed in the objects.

Objects can be represented and processed by multilevel mark code; the coding of multilevel mark code comprises the steps of:

(1) Forming the object code from code segments, each segment consisting of coding locations;

(2) Selecting at least one coding location as the marking location for each segment;

(3) Marking the marking locations of each segment by multilevel mark.

Codes coded according to the rules of multilevel mark are called multilevel mark codes; otherwise they are called non-multilevel mark codes. Therefore, the singular machine codes and multilevel machine codes in the references CN 1122476A and CN 1182234A are all non-multilevel mark codes.

Some examples of multilevel mark code are as follows:

Assume one code segment consists of four bytes, the A mark is any of the 26 English letters, and the B mark is any one of {0, 1 . . . 9}. The marking location is the first byte of each segment, and it is marked by right B multilevel mark. Then the following codes satisfy the conditions of multilevel mark code:

abcd 26ds; 4hgd; gb3d ji34 2arf;

The examples above show that the number of code segments is not restricted. The first code consists of 2 segments, the second code consists of 1 segment, and the third code consists of 3 segments.
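To illustrate how such codes can be told apart in a continuous stream, the sketch below splits a string of 4-byte segments into codes using the right B mark of the example (a letter in the first byte means more segments follow, a digit closes the code). The function name `split_codes` is an assumption made for illustration.

```python
def split_codes(stream: str, segment_len: int = 4) -> list[str]:
    """Split a stream of fixed-length segments into multilevel mark codes.

    The first character of each segment is the mark: an English letter is the
    A mark (code continues), a digit is the B mark (group representative,
    code ends with this segment).
    """
    if len(stream) % segment_len:
        raise ValueError("stream length is not a multiple of the segment length")
    codes, current = [], []
    for i in range(0, len(stream), segment_len):
        segment = stream[i:i + segment_len]
        mark = segment[0]
        current.append(segment)
        if mark.isdigit():                        # B mark closes the code
            codes.append(" ".join(current))
            current = []
        elif not (mark.isascii() and mark.isalpha()):
            raise ValueError(f"invalid mark {mark!r}")
    if current:
        raise ValueError("last code is not closed by a B mark")
    return codes


print(split_codes("abcd26ds4hgdgb3dji342arf"))
# ['abcd 26ds', '4hgd', 'gb3d ji34 2arf']
```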

Another characteristic of the codes above is that the letters and digits serve as marks. This is different from multilevel mark code in binary codes, where the mark locations hold only 0, 1 marks.

The most useful MMC consists of binary bits. This has been introduced in section 1; the related methods introduced there can also be used in this section.

The length of segments can differ, for example 2, 4 or 8 bits, or multiple bytes. However, it must be defined in advance.

English words can be grouped by multilevel mark. The first bit of each English letter code is 0. Taking right 0 multilevel mark, the first bit of each letter of a word is set to 1, except for the last letter of the word. The word is then marked by multilevel mark. Even if the spaces between the words are deleted, the words can still be distinguished.
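A minimal sketch of this inner marking, assuming the letters are stored as 7-bit ASCII in 8-bit bytes so that the first (highest) bit is free to carry the mark; the function names `mark_words` and `unmark_words` are hypothetical.

```python
def mark_words(words: list[str]) -> bytes:
    """Concatenate words without spaces, marking them with right 0 M mark:
    the high bit is 1 on every letter except the last letter of each word."""
    out = bytearray()
    for word in words:
        data = word.encode("ascii")
        if not data:
            raise ValueError("empty word")
        out.extend(b | 0x80 for b in data[:-1])   # mark 1: more letters follow
        out.append(data[-1])                      # mark 0: group representative
    return bytes(out)


def unmark_words(data: bytes) -> list[str]:
    """Recover the words from the marked byte string."""
    words, current = [], []
    for b in data:
        current.append(chr(b & 0x7F))             # strip the mark bit
        if b & 0x80 == 0:                         # mark 0 ends the word
            words.append("".join(current))
            current = []
    if current:
        raise ValueError("last word is not closed by a 0 mark")
    return words


marked = mark_words(["English", "is"])
print("".join(str(b >> 7) for b in marked))  # 111111010
print(unmark_words(marked))                  # ['English', 'is']
```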

Objects in time sequence can be grouped by multilevel mark. For example,

1 X 1 X 1 X 1 X 1 X 1 X 1 X 0 X

Multilevel marks are placed in the data above, which is 8-bit data. Here, “X” represents 0 or 1. In data transmission, if errors appear in the marks, data errors have occurred with high probability. Errors of data transmission can be analyzed by this method.

An outer M mark can be transformed into an inner M mark; the outer M mark of a word or phrase can be transformed into the inner M mark of a combined MMC. The inner M mark of the MMC of a Chinese phrase can be transformed into an outer M mark.

4.3 Inter Multilevel mark (Inter M mark)

Inter M marks are placed among the objects in an object sequence. Inter M marks can be divided into consecutive and non-consecutive M marks. Take an example:

1 1 1 1 1 1 1 0 X X X X X X X X

In the table above, “X” represents data. 11111110 is a consecutive mark. It says that the next 8 bits are a group of data. There is at least one bit of data and at least one bit of mark in applications of inter M mark.

With multiple-bit marking, different marks can group different object groups and represent different meanings.

For example, assume 4 English words, and assume a phrase consists of the first word, the third word and the fourth word. Put the following marks in front of each word:

11 10 11 00

And declare that the mark “11” means the word is part of the phrase, and the mark “10” means the word is not part of the phrase. In the example, the M mark 1 relates to “11”, and the M mark 0 relates to “00”. This is an example of non-consecutive inter M mark. When right 0 mark is used in non-consecutive inter M marks, scanning from the first mark to the right, if the mark met is neither mark 1 nor mark 0, then the object does not belong to the object group. The object group ends when mark 0 is met. If the first mark is 0, then the object group contains only one object. This method can also be used in inner M mark and outer M mark.
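The scanning rule just described can be sketched as below. The two-bit marks are those of the example; the placeholder words `w1` to `w4` and the function name `select_group` are invented for illustration, since the specification does not name the four words.

```python
def select_group(marked: list[tuple[str, str]]) -> list[str]:
    """Collect one object group from (mark, object) pairs using
    non-consecutive right 0 inter M mark:
    '11' = M mark 1 (member), '00' = M mark 0 (last member),
    any other mark = object does not belong to the group."""
    group = []
    for mark, obj in marked:
        if mark == "11":
            group.append(obj)
        elif mark == "00":
            group.append(obj)
            return group          # mark 0 ends the object group
        # any other mark: skip this object
    raise ValueError("object group is not closed by mark 0")


sequence = [("11", "w1"), ("10", "w2"), ("11", "w3"), ("00", "w4")]
print(select_group(sequence))  # ['w1', 'w3', 'w4']
```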

M mark can be used to group objects in a set; that is to say, the set can be represented by one segment or multiple segments, and each segment contains a marking element, which is represented by an M mark.

In information transmission and communication, the information consists of bits, called a bit stream, and M mark can be used to mark the bit stream. The method to mark a bit stream by outer M mark is described as follows:

Marking each bit of the bit stream by outer M marks, and generating a mark bit stream;

Sending the data bit stream and the mark bit stream separately.

The method to mark a bit stream by inner M mark is described as follows:

Dividing the bit stream into segments of equal length;

Adding a mark bit to each segment;

Marking the mark bits with M marks.
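The inner marking of a bit stream can be sketched as follows, assuming right 0 M mark and assuming the mark bit is placed in front of each segment, as in the “1 X 1 X . . . 0 X” example of section 4.2. Bit strings are represented as Python strings of '0' and '1' for readability; the function names are hypothetical.

```python
def add_inner_marks(groups: list[str], seg_len: int) -> str:
    """Cut each group of data bits into equal segments and prefix each segment
    with a mark bit: 1 if more segments of the group follow, 0 on the last."""
    out = []
    for bits in groups:
        if not bits or len(bits) % seg_len:
            raise ValueError("group length must be a positive multiple of seg_len")
        segments = [bits[i:i + seg_len] for i in range(0, len(bits), seg_len)]
        out.extend("1" + seg for seg in segments[:-1])
        out.append("0" + segments[-1])
    return "".join(out)


def read_groups(stream: str, seg_len: int) -> list[str]:
    """Recover the groups of data bits from a marked bit stream."""
    step = seg_len + 1
    if len(stream) % step:
        raise ValueError("stream length does not match the segment length")
    groups, current = [], ""
    for i in range(0, len(stream), step):
        current += stream[i + 1:i + step]     # data bits of this segment
        if stream[i] == "0":                  # mark 0 closes the group
            groups.append(current)
            current = ""
    if current:
        raise ValueError("last group is not closed by mark 0")
    return groups


marked = add_inner_marks(["1010", "11"], seg_len=2)
print(marked)                  # 110010011
print(read_groups(marked, 2))  # ['1010', '11']
```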

5. Representing and Processing of Object Tree Structure

A tree can be represented by its tree structure, by the data of the tree (the entity of the tree), or by both structure and data. The data of tree nodes can be in different forms, such as one datum, multiple data, image, text, voice, etc. As it can be difficult to represent the data in the node, the node can store a pointer or an MMC which points to the real data. A tree whose nodes can have an arbitrary number of children is called a general tree. A tree in which a node can have one more child than the number of data items it holds is called an A-B tree, such as binary trees, 2-3 trees, B-trees, B⁺ trees, etc. For a multiple dimension tree, the number of children is N=2^(D), where N is the number of children and D is the number of dimensions. Trees and sub trees can be represented by matrixes. A matrix can be classified as a right matrix, left matrix, left right matrix, trapezoid matrix, list matrix, or block matrix. The operations on trees can be classified into structure operations and entity operations. If the structures of two trees are not equal, then the trees cannot be equal. An operation on trees can therefore be divided into two steps: the structure operation and the entity operation.

An object sequence can be marked by M mark, a matrix can be generated by the marking procedure, and the matrix can be used to represent a tree structure. Tree nodes can be marked by M mark, and this can also generate a matrix to represent the tree. An object sequence can be grouped level by level. The first-level marks are used to group the object sequence first; afterward, each grouping groups the group representatives of the lower level; the marks of each level relate to one row (or one column) of the related matrix; each mark is an element of the matrix. An element of the matrix can include multiple data for an A-B tree or a multiple dimension tree. The matrix can be a structure matrix or an entity matrix of a tree. A structure matrix stores binary 0 or 1. An entity matrix stores real data.

The marking procedure for the nodes of a tree is similar to the procedure above.

The marking procedure for an object sequence is illustrated in FIG. 3 and FIG. 4.

A tree structure can be represented by a left matrix, a right matrix, a left right matrix, a trapezoid matrix, or a list matrix;

the characteristics of the right matrix of a tree are: the root of the tree is at the right corner of the matrix; the first-generation children of a parent are in the next row, counted from right to left, with the rightmost child just below the parent and the other children to its left;

the characteristics of the left matrix of a tree are: the root of the tree is at the left corner of the matrix; the first-generation children of a parent are in the next row, counted from left to right, with the leftmost child just below the parent and the other children to its right;

the characteristics of the left right matrix of a tree are: the root of the tree is in the first row of the matrix; the first-generation children of a parent are in the next row and are divided into two parts, a left part and a right part; the right part is to the right of the parent, and the left part is to the left of the parent; in each column of the matrix only one element relates to a node of the tree structure;

the characteristics of the trapezoid matrix of a tree are: the trapezoid matrix relates to a tree structure of order N and consists of one-dimensional arrays arranged from top to bottom; the number of elements of the m-th array is N^(m), (m=0,1,2 . . . ), and the elements relate to the nodes of the tree structure;

the characteristics of the list matrix of a tree are: the list matrix relates to the trapezoid matrix; putting the elements of the trapezoid matrix into a list, from top to bottom and from left to right, forms the list matrix.

The matrixes above are illustrated in FIG. 5, FIG. 6, . . . , and FIG. 12.

Processing methods are described as follows:

Object processing comprises at least one of the following steps:

searching child by right matrix (left matrix is similar): if children exist, the children are in the row below the current node, between the column in which the current node is located and the column just right of the current node's nearest left sibling;

searching parent by right matrix (left matrix is similar): scan from the element just above the current node to its right; the first element that is not null is the parent;

searching sibling by right matrix (left matrix is similar): all nodes in the row in which the current node is located are siblings of each other;

searching family by right matrix (left matrix is similar): the family of the current node is a sub array: its top row is the row in which the current node is located, its right column is the column in which the current node is located, and its left column is the column just right of the current node's nearest left sibling, or the leftmost column of the R array if there is no left sibling;

searching child by trapezoid matrix: if the current node is the J-th node of the [K−1]-th array, then the first child of the node is in the [K]-th array, at element N*J;

searching parent by trapezoid matrix: if the current node is the J-th node of the [K]-th array, then the parent is in the [K−1]-th array, at element J/N;

searching sibling by trapezoid matrix: the nodes of the K-th array are siblings of each other;

searching the node data number by trapezoid matrix: if the tree is an A-B tree, DATANUMBER=N−1;

Assume a node is at position J in the list array; then

searching level by list matrix: if the node J is in [N^(k), N^(k+1)), then the level of J is k;

searching child for the node J by list matrix: assuming J is at level k, the first child is at level (k+1), at position N^(k+1)+(J−N^(k))*N;

searching parent by list matrix: assuming J is at level k, the parent is at level (k−1), at position N^(k−1)+(J−N^(k))/N;

searching sibling by list matrix: the nodes in [N^(k), N^(k+1)) are siblings of each other;

searching child for the root by list matrix: assuming the searched data is X and the order is N, calculate, according to the tree type (A-B tree or multi-dimension tree), the child number for each node passed; assume the child numbers for each level are j₀, j₁, j₂, . . . j_(k); for each search step, the child position from the root can be calculated by N^(k+1)+(J_(k)−N^(k))*N.
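The index formulas for the trapezoid matrix and the list matrix can be written out as a short sketch. It rests on two assumptions not stated explicitly in the text: the divisions J/N are integer divisions, and in the list matrix level k begins at position N^(k), which for N=2 coincides with the usual 1-based array layout of a binary tree. All function names are hypothetical.

```python
def trapezoid_first_child(j: int, n: int) -> int:
    """Element index in array K of the first child of element j in array K-1."""
    return n * j


def trapezoid_parent(j: int, n: int) -> int:
    """Element index in array K-1 of the parent of element j in array K."""
    return j // n


def list_level(j: int, n: int) -> int:
    """Level k such that position j lies in [n**k, n**(k+1))."""
    if j < 1 or n < 2:
        raise ValueError("position must be >= 1 and order must be >= 2")
    k = 0
    while n ** (k + 1) <= j:
        k += 1
    return k


def list_first_child(j: int, n: int) -> int:
    """Position of the first child: N^(k+1) + (J - N^k) * N."""
    k = list_level(j, n)
    return n ** (k + 1) + (j - n ** k) * n


def list_parent(j: int, n: int) -> int:
    """Position of the parent: N^(k-1) + (J - N^k) / N."""
    k = list_level(j, n)
    if k == 0:
        raise ValueError("the root has no parent")
    return n ** (k - 1) + (j - n ** k) // n


print(list_first_child(3, 2))  # 6
print(list_parent(7, 2))       # 3
print(list_first_child(4, 3))  # 12
print(list_parent(12, 3))      # 4
```

The remaining children of a node follow its first child at consecutive positions, so child number c (counting from 0) would be at `list_first_child(j, n) + c` under the same assumptions.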

5.1 Right Matrix (RA), Left Matrix (LA), Left Right Matrix (LRA)

The matrix of a tree can be generated by marking the tree nodes by M mark, or by marking an object sequence by M mark.

Multilevel marks can be classified into four classes: Right A multilevel mark, Left A multilevel mark, Right B multilevel mark, and Left B multilevel mark. The matrixes related to the multilevel marks above are the Right A matrix, Left A matrix, Right B matrix, and Left B matrix respectively.

The Right A matrix, Left A matrix and Left Right matrix are represented by RA, LA and LRA respectively.

The structure and data of a tree can be stored in one matrix, or in separate matrixes. It usually saves space if the tree structure is stored in a binary matrix and the tree data is stored in a list matrix. Storing the structure and the data in one matrix is compact and simple, but usually needs more space than the former.

The element of a tree node can be empty or an object; the objects can be of the same class or of different classes. If the objects are of different classes, the node can store pointers, each of which points to the related object.

A tree matrix can be generated top down, from root to leaf, or bottom up, from leaf to root. The generation of a matrix by marking an object sequence is shown in Section 4.

FIG. 5 is a block diagram illustrating a tree.

FIG. 6 is a block diagram illustrating the Left Right Matrix (LRA) of FIG. 5; in each column of the matrix only one element relates to a node of the tree structure; the first-generation children of a parent are in the next row and are divided into two parts, a left part and a right part; the right part is to the right of the parent, and the left part is to the left of the parent.

FIG. 7 is a block diagram illustrating the Right Matrix (RA) of FIG. 5; the root of the tree is at the right corner of the matrix; the first-generation children of a parent are in the next row. RA can be generated by compressing LRA: the right children of each node of LRA are compressed to the column of the node from the right side of the matrix, while keeping the correct relation of the family.

FIG. 8 is a block diagram illustrating the Left Matrix (LA) of FIG. 5; the root of the tree is at the left corner of the matrix; the first-generation children of a parent are in the next row. LA can be generated by compressing LRA: the left children of each node of LRA are compressed to the column of the node from the left side of the matrix, while keeping the correct relation of the family.
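The parent and child searches on a right matrix can be sketched as below. The small matrix `RA` is a made-up example (root R with children A, B, C, where C has children D and E), not the tree of FIG. 5, and `None` stands for an empty element. The left bound for the child search is taken as the column just right of the nearest non-empty element to the left in the same row, which is one reading of the rule given earlier.

```python
from typing import Optional

Matrix = list[list[Optional[str]]]

# Hypothetical right matrix: the root is at the right corner, and the
# rightmost child of every parent sits directly below the parent.
RA: Matrix = [
    [None, None, None, "R"],
    ["A",  "B",  None, "C"],
    [None, None, "D",  "E"],
]


def find_parent(m: Matrix, row: int, col: int) -> Optional[str]:
    """Scan the row above, from the element just above the node to the right;
    the first non-empty element is the parent."""
    if row == 0:
        return None
    for c in range(col, len(m[row - 1])):
        if m[row - 1][c] is not None:
            return m[row - 1][c]
    return None


def find_children(m: Matrix, row: int, col: int) -> list[str]:
    """Children lie in the next row, between the column just right of the
    nearest left neighbour in the node's row and the node's own column."""
    if row + 1 >= len(m):
        return []
    left = 0
    for c in range(col - 1, -1, -1):
        if m[row][c] is not None:
            left = c + 1
            break
    return [x for x in m[row + 1][left:col + 1] if x is not None]


print(find_parent(RA, 2, 2))    # C
print(find_children(RA, 0, 3))  # ['A', 'B', 'C']
print(find_children(RA, 1, 3))  # ['D', 'E']
```

The same functions apply to a left matrix after mirroring the scan directions.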

Alphabetic Tree by RA Matrix

An alphabetic tree relates to a set of key words.

EXAMPLE 1

Assume a set of key words is: K={xem,xul,xal,wul,wen,wim,wil,wan,zi,zom,zol,yum,yon,yo}.

(Refer to: Data structure, Xu Zhuoqun, High Education publication CoPublish C. 1987).

FIG. 9 is a block diagram illustrating an alphabetic tree represented byRA matrix.

EXAMPLE 2

An alphabetic trie represented by RA matrix.

FIG. 10 is a block diagram illustrating an alphabetic tree (trie); the trie structure in (a) is represented by a Right Matrix, and the letters are filled in according to the words in (b); related words: ant, anteater, antelope, chicken, deer, duck, goat, goldfish, goose, horse.

(Refer to: FIG. 13.2, A practical introduction to Data structures and Algorithm Analysis, second edition, By Shaffer, C. A., Electronic publication Co. 2002)

5.2 Trapezoid Matrix

Definition of trapezoid matrix: a matrix consisting of rows 0, 1, . . . , K, in which row k contains N^(k) elements, where N>1 and N is an integer.

A trapezoid matrix can be used to represent an A-B tree; in this case, the elements relate to the nodes of the tree.

There are (N−1) data in a node of an N order A-B tree. The relation between the dimension D and the order of a multiple dimension tree is N=2^(D), D=log₂N. Therefore, each node of the trapezoid matrix holds (N−1) data for an A-B tree, and D data for a multiple dimension tree. The data of the nodes can also be stored in a one-dimensional array.

The trapezoid matrix is called a logic trapezoid matrix if the elements of the nodes are 0 or 1. A logic matrix can be used to represent the tree structure. If there is a logic matrix for the tree, the data of the tree can be stored consecutively in an array, thus saving space. It is easy to recover the matrix with real data from the logic matrix and the data array.
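A minimal sketch of that recovery step, assuming the data array lists the node data in the same top-down, left-to-right order as the 1 elements of the logic matrix (the function name `recover_entity_matrix` and the sample data are illustrative only):

```python
from typing import Optional, Sequence


def recover_entity_matrix(logic: Sequence[Sequence[int]], data: Sequence[str]) -> list[list[Optional[str]]]:
    """Rebuild the matrix with real data: every 1 in the logic matrix takes
    the next item of the data array, every 0 stays empty (None)."""
    ones = sum(bit for row in logic for bit in row)
    if ones != len(data):
        raise ValueError("data array length must equal the number of 1 elements")
    items = iter(data)
    return [[next(items) if bit else None for bit in row] for row in logic]


# Logic trapezoid matrix of order N = 2 with rows of 1, 2 and 4 elements.
logic = [[1], [1, 0], [1, 1, 0, 0]]
print(recover_entity_matrix(logic, ["a", "b", "c", "d"]))
# [['a'], ['b', None], ['c', 'd', None, None]]
```

Only the four stored items and seven bits are kept, instead of seven full elements.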

FIG. 11 is a block diagram illustrating a B⁺ tree. (Refer to: FIG. 10.16, A practical introduction to Data structures and Algorithm Analysis, second edition, By Shaffer, C. A., Electronic publication Co. 2002).

FIG. 12 is a block diagram illustrating the trapezoid matrix of the B⁺ tree in FIG. 11. In FIG. 12, a Left matrix is used. A Right matrix may also be used, and the proper matrix can be selected according to practice.

Next, the trapezoid matrix is represented by arrays.

Arrays R1, R2 and R3 relate to the first to third rows in FIG. 12 respectively.

R3[3]=33;

R2[4][3]=18,23; 48

R1[16][5]=18,12,26;18,19,20,21,22;23,30,31;33,46,47;48,60,62;

5.3 List Matrix

Definition of list matrix: putting the elements of a trapezoid matrix into a one-dimensional array in top-down, left-to-right sequence forms a list, which is called a list matrix.

A list matrix can be used to represent an A-B tree and a multiple dimension tree.

There are (N−1) data in a node of an N order A-B tree. The relation between the dimension D and the order of a multiple dimension tree is N=2^(D), D=log₂N. Therefore, each node of the list matrix holds (N−1) data for an A-B tree, and D data for a multiple dimension tree.

The list matrix is called a logic list matrix if the elements of the nodes are 0 or 1. A logic matrix can be used to represent the tree structure. If there is a logic matrix for the tree, the data of the tree can be stored consecutively in an array, thus saving space. It is easy to recover the list matrix with real data from the logic matrix and the data array. A binary tree, a 4 tree (a parent node can have 4 children at most), and a B⁺ tree can all be represented by a list matrix.

A list matrix can be segmented, and the segments can be linked by M mark.

5.4 Block Matrix

The said block matrix consists of sub trees. A sub tree can be represented by one of the kinds of matrix above; the matrix related to a sub tree is called a block matrix. A block matrix which does not include the root at the top of the whole tree relates to one leaf of the block matrix one level higher.

6. Object Processing Apparatus

The object processing methods above can be described as apparatus.

In object representing and processing, an object can be represented by non-MMC, MMC, or M mark; an apparatus consists of at least one of the following apparatus:

(1) multilevel mark code apparatus: consisting of code segments, the said code segment consisting of binary bits; a mark bit selected in each segment; and the mark bits of the segments of the object code marked with multilevel mark;

(2) multilevel mark code apparatus with class segment: a multilevel mark code apparatus for multiple classes of objects consisting of two parts: a class part and a coding part; for an N coding segment MMC, if N>2, then the class segment contains the information which tells whether the next segment (or segments) is a class segment or a coding segment (or segments); if N=2, then the class segment contains the information which tells the coding class of the next segment; the class part can be one or multiple segments;

(3) one of the following multilevel mark code apparatus:

(A) linked multilevel mark code, which is related to an object in the data source by its location;

(B) embedded multilevel mark code, which is embedded together with its related object and has the code range coded inside;

(C) inter mark multilevel mark code, which is located in an object sequence and has the object relation information coded inside;

(D) combined multilevel mark code, which combines the object codes according to the coding method of multilevel mark code;

(4) multiple property code apparatus: consisting of multiple property codes for a multiple property object, each of the multiple property codes related to one property of the multiple property object, and the multiple property object in an object group represented by the multiple property code related to its property;

(5) sequenced full pronunciation apparatus of Chinese character: the apparatus consisting of two bytes, which hold two letters of 5 bits each and one of 5 pronunciation tones in 3 bits; the order of the two letters is selected in accordance with the order of Chinese syllables represented in Pinyin (or phonetic notation); the value of the 3 bits is selected in accordance with the tone order in the standard dictionary;

(6) sequenced full pronunciation comparing apparatus: the apparatus consisting of

(A) retrieving apparatus: retrieving the sequenced full pronunciation strings from two Chinese character strings;

(B) comparing apparatus: comparing the strings by comparing the related sequenced full pronunciation strings;

retrieving the comparison result of the character strings according to the comparison above;

(7) coded input object processing apparatus: consisting of an object input code table; the input code table of coded input objects consists of the relation between each singular object and its input code; when the object code sequence is consistent with the input code sequence, the table consists only of the input codes of the objects, sequenced according to the object code sequence; object codes can be non-multilevel mark code or multilevel mark code; the processing method for coded objects comprises the step of generating the input code for the data to be processed according to the object input code table;

for a multiple property object, eliminating the confusion of input codes for one object code by multiple property codes;

Object inputting or object searching for coded input objects comprises the steps of:

(A) inputting input codes for the objects to be inputted or to be searched;

(B) generating input codes of the object codes in the related data according to the object input code table;

(C) getting the inputted objects or searched objects by comparing the inputted input codes with the generated input codes;
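Steps (A) to (C) can be sketched as a search over text by input code. The three-entry table below is a hypothetical stand-in for an object input code table (Pinyin without tones is used only as an example input method), and `search_by_input_code` is an illustrative name.

```python
# Hypothetical object input code table: object -> input code.
INPUT_CODES = {"中": "zhong", "文": "wen", "国": "guo"}


def search_by_input_code(data: str, typed: list[str], table: dict[str, str]) -> list[int]:
    """Return the start positions in `data` whose generated input codes
    match the typed input codes."""
    if not typed:
        raise ValueError("no input codes given")
    # Step (B): generate the input codes of the objects in the data.
    generated = [table.get(ch) for ch in data]
    # Step (C): compare the typed codes with the generated codes.
    width = len(typed)
    return [start for start in range(len(generated) - width + 1)
            if generated[start:start + width] == typed]


print(search_by_input_code("中文中国", ["zhong", "guo"], INPUT_CODES))  # [2]
print(search_by_input_code("中文中国", ["zhong"], INPUT_CODES))         # [0, 2]
```

The data itself stays in its object codes; input codes are generated only for comparison.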

The output method of object input codes comprises the following step: generating input codes of the object codes in the related data according to the object input code table;

In the object processing above, multiple input code tables can be used, and they may contain one or multiple data resources; in the comparison, if too many objects are matched, input the input codes of another input method, and then select the results that match the input codes of the multiple input methods;

(8) coded input objects cascade input apparatus, consisting of: a compound input code table, which contains the relation between objects and input codes for two or more input methods; or multiple input code tables, each of which relates to one input method; the cascade input method realizes object or object group inputting by inputting input codes of two or more input methods, the cascaded inputting comprising the steps of:

(A) inputting input codes of one class;

(B) comparing the inputted codes with the related class of input codes in the compound input code table, or with the input codes of the related single input code table;

(C) if satisfactory results are selected, going to (E);

(D) if there are too many objects or object groups to select from, inputting input codes of another input method, then selecting the satisfactory results;

(E) retrieving the satisfactory results;
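The cascade steps can be sketched as successive filtering of the candidate objects. Both tables below are hypothetical placeholders for input code tables of two input methods; the names `TABLES` and `cascade_select` and the `max_results` threshold are assumptions introduced for the example.

```python
# Hypothetical input code tables, one per input method.
TABLES = {
    "pinyin": {"中": "zhong", "种": "zhong", "钟": "zhong", "文": "wen"},
    "shape":  {"中": "k", "种": "t", "钟": "q", "文": "y"},
}


def cascade_select(candidates: list[str],
                   typed: list[tuple[str, str]],
                   tables: dict[str, dict[str, str]],
                   max_results: int = 1) -> list[str]:
    """Filter the candidates with the input codes of one method after another,
    stopping as soon as few enough candidates remain."""
    remaining = list(candidates)
    for method, code in typed:
        # Steps (A)/(B): compare the typed code with this method's table.
        remaining = [obj for obj in remaining if tables[method].get(obj) == code]
        # Step (C): stop when the result is already satisfactory.
        if len(remaining) <= max_results:
            break
        # Step (D): otherwise continue with the next input method.
    return remaining


objects = ["中", "种", "钟", "文"]
print(cascade_select(objects, [("pinyin", "zhong"), ("shape", "t")], TABLES))  # ['种']
```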

(9) multilevel mark apparatus: consisting of at least one of the following apparatus:

Right A multilevel mark apparatus: the rightmost object in an object group is marked with the A mark; the other N objects in the object group are marked with the B mark; the A mark is called the group representative;

Left A multilevel mark apparatus: the leftmost object in an object group is marked with the A mark; the other N objects in the object group are marked with the B mark; the A mark is called the group representative;

Right B multilevel mark apparatus: the rightmost object in an object group is marked with the B mark; the other N objects in the object group are marked with the A mark; the B mark is called the group representative;

Left B multilevel mark apparatus: the leftmost object in an object group is marked with the B mark; the other N objects in the object group are marked with the A mark; the B mark is called the group representative;

The said N is a positive integer or 0.

Take an electronic dictionary as an example of the apparatus; it consists of an MMC apparatus with multiple languages, a coded input object input apparatus, and a sequenced full pronunciation apparatus for Chinese character codes. Compared with the dictionaries used now, the dictionary can save storage space and provide word-level inputting and input code searching functions. The number of languages in the dictionary is not limited by MMC. With the Chinese multiple property apparatus, the dictionary can output correct pinyin and Chinese speech.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating a 2 byte MMC with class byte.Right 0 multilevel mark is adopted for the MMC, the first byte is aclass byte with mark bit 1 in the first bit and with mark bit 0 in thefirst bit of the second byte.

FIG. 2 is a block diagram illustrating a 3 byte MMC. Right 0 multilevelmark is adopted for the MMC, the mark bit is in the first bit of eachbyte, and the value of the mark bits is 1, 1, and 0, respectively. Thefirst byte is a class byte, which tells that if the second byte is aclass byte or the class of the code consisted of the second and thethird byte. If the second byte is a class byte, then it tells what kindof code the third byte is.

FIG. 3 is a block diagram illustrating marking procedure by right 1multilevel mark for string with Chinese and English text:

English is

and a matrix is generated by the marking procedure. 0, 1 in FIG. 3 aremarks.

FIG. 4 is a block diagram illustrating tree generating procedure bymultilevel marking. The leaves of the tree is represented by characters,the intersection of lines is inner node, and the phrase or sentenceconsisted by characters can be represented by inner nodes.

FIG. 5 is a block diagram illustrating a tree, and in the figure, A, B,C . . . J is nodes of the tree.

FIG. 6 is a block diagram illustrating Left Right Matrix (LRA) of FIG.5, in each column of the matrix only one element existed and related toone node of tree structure; the first generation children of the parentat the next row, and divided into two parts: left part and right part;the right part right to the parent, and the left part left to theparent.

FIG. 7 is a block diagram illustrating Right Matrix (RA) of FIG. 5, theroot of tree at the right corner of the matrix; the first generationchildren of the parent at the next row, the rightest child of them justbelow the parent, and the other children at the left side.

FIG. 8 is a block diagram illustrating Left Matrix (LA) of FIG. 5, theroot of tree at the left corner of the matrix; the first generationchildren of the parent at the next row, the leftest child of them justbelow the parent, and the other children at the right side.

FIG. 9 is a block diagram illustrating an alphabetic tree by RA, the setof key words is: K={xem,xul,xal,wul,wen,wim,wil,wan,zi,zom,zol,yum,yon,yo}.

FIG. 10 is a block diagram illustrating an alphabetic tree, trie; (a) isthe trie structure represented by Right Matrix, and the letters arefilled in according to the words in (b), related words: ant, anteater,antelope, chicken, deer, duck, goat, goldfish, goose, horse.

FIG. 11 is a block diagram illustrating a B⁺ tree of order 4. Three groups of data can be stored in a node. A leaf node can store at most 5 groups of data; the data listed in the figure are the data stored in the nodes.

FIG. 12 is a block diagram illustrating the trapezoid matrix of the B⁺ tree in FIG. 11. FIG. 12 (a) lists the element values in the trapezoid matrix. Because a node holds at most 3 groups of data, a node has at most 4 children. There are 3 elements in the first level of the trapezoid matrix and 12 elements in the second level; the leaf nodes are in the third level, and each leaf node can store at most 5 records. Only part of the nodes are listed in FIG. 12 (b), which is a simplified form of (a): one element in the matrix relates to one node in (a), and its value is the first data item in that node.
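The index arithmetic behind a trapezoid matrix and its flattened list matrix can be sketched as follows. The sketch counts one element per node, so that level m of an order-N tree holds N^m elements, as in the simplified form of FIG. 12 (b); the counts of 3 and 12 quoted for FIG. 12 (a) refer to data groups rather than nodes. The function names and the 0-based numbering are assumptions of this sketch.

```python
def level_offset(order, level):
    """Number of elements in all levels above `level`: 1 + N + ... + N**(level-1)."""
    if order < 2:
        raise ValueError("order must be at least 2")
    return (order ** level - 1) // (order - 1)


def list_index(order, level, position):
    """Index in the list matrix of the element at `position` of `level`."""
    if not 0 <= position < order ** level:
        raise IndexError("position outside level")
    return level_offset(order, level) + position


def child_position(order, position, j):
    """Position in the next level of the j-th child (0-based) of a node."""
    if not 0 <= j < order:
        raise IndexError("child number outside 0..order-1")
    return order * position + j


# Order 4, as in FIG. 11: the levels hold 1, 4, and 16 nodes.
offsets = [level_offset(4, m) for m in range(3)]
third_child = child_position(4, 2, 1)       # child 1 of node 2 in level 1
flat = list_index(4, 2, third_child)
```

Because a child's position follows from its parent's position by arithmetic alone, no child pointers need to be stored in the matrix.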

MORE EXAMPLES TO REALIZE THE INVENTION

Besides the examples introduced above, further examples are as follows:

Operating system for multiple objects and multiple languages: the number of languages is unlimited, and the character codes are consistent with their original codes. The system stores the MMC classes, and MMC and non-MMC can be transformed into each other. Text in the original codes of a language is stored on external storage media, and the MMC form is stored in main memory. MMC codes of a language are transformed into the original codes on output. The operating system can provide the input method and the search methods for coded input objects presented in this invention.
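The round trip between an original code and its MMC form can be sketched as follows for a two-byte original code. The byte layout is hypothetical: the class byte carries a 5-bit code family number plus the original top bits of the two bytes, so that the two freed bits correspond to 2^2 = 4 classes per family; the invention does not prescribe this exact layout. The mark bit is assumed to be the top bit of each byte, with marks 1, 1, 0 as in a right 0 multilevel mark.

```python
def to_mmc(code, family=0):
    """Wrap a 2-byte original code into a 3-byte MMC (hypothetical layout).

    Class byte: mark bit 1, then 5 bits of `family`, then the two original
    top bits. Coding part: the original bytes with top bits replaced by the
    marks 1 and 0.
    """
    if len(code) != 2:
        raise ValueError("this sketch handles 2-byte codes only")
    if not 0 <= family < 32:
        raise ValueError("family must fit in 5 bits")
    top0 = (code[0] >> 7) & 1
    top1 = (code[1] >> 7) & 1
    class_byte = 0x80 | (family << 2) | (top0 << 1) | top1
    return bytes([class_byte, 0x80 | (code[0] & 0x7F), code[1] & 0x7F])


def from_mmc(mmc):
    """Restore the original 2-byte code from the 3-byte MMC form."""
    class_byte, first, second = mmc
    top0 = (class_byte >> 1) & 1
    top1 = class_byte & 1
    return bytes([(top0 << 7) | (first & 0x7F), (top1 << 7) | (second & 0x7F)])


original = bytes([0xD6, 0xD0])      # a two-byte code with both top bits set
mmc_form = to_mmc(original)
restored = from_mmc(mmc_form)
```

Text would be converted with `to_mmc` when loaded into main memory and with `from_mmc` on output, so the stored files keep their original codes.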

It is easy to realize multiple language searching, web addressing, and domain names by MMC. Any objects, such as video, voice, and image, can be used in text just like characters.

By this invention it is easy to raise Chinese searching from character level to word level, so that searching becomes more precise, more complete, and quicker. If the marks in markup languages and formatted text are replaced by MMC, space is saved and operations are simplified.

1. A method of object processing, wherein the object is encoded by multilevel mark code, and the encoding of multilevel mark code comprises the steps of: (1) forming the object code from code segments, each code segment consisting of binary bits; (2) selecting a mark bit in each segment of the code; (3) marking the mark bits of the code segments by multilevel mark.
2. The method as defined in claim 1, further comprising at least one of the following steps: (1) transforming non-multilevel mark code into multilevel mark code by multilevel direction transformation; (2) transforming multilevel mark code into non-multilevel mark code by single direction transformation; the said transforming is based on the relation between non-multilevel mark code and multilevel mark code; the relation is as follows: assuming the non-multilevel mark code of one kind of object consists of N bytes, N>=1, if M of those bytes are without mark bit, then the code can be represented by 2^M classes of multilevel mark codes; adding 1 class byte, there are 2^M different values, each relating to one of the 2^M classes of multilevel mark code; if there are J kinds of non-multilevel mark codes with N bytes, and the kinds can be represented by K_1, K_2, . . . , K_J classes of multilevel mark code respectively, then the total number of classes of multilevel mark code is S = K_1 + K_2 + . . . + K_J; if the class byte does not have enough space to represent all of them, more class bytes are added to the multilevel mark code; the said multilevel direction transformation comprises the steps of: (1) recognizing the class of the non-multilevel mark code; (2) selecting the class byte of the multilevel mark code according to the class of the non-multilevel mark code and its related mark bit value of the multilevel mark code; (3) taking the non-multilevel mark code as the coding part of the multilevel mark code, and setting the mark bits according to the coding requirement of multilevel mark code; (4) combining the class byte and the coding part as the multilevel mark code of the non-multilevel mark code; the class part can be omitted if the class of the multilevel mark code can be known from the context; if the length of the multilevel mark code is equal to the length of a kind of multilevel mark code without class part, one more class segment is added which contains the class information of the class segment of the original multilevel mark code; the rest is deduced by analogy; the said single direction transformation comprises the steps of: restoring the non-multilevel mark code by taking the coding part of the multilevel mark code as the non-multilevel mark code, according to the relation between the class byte and the related non-multilevel mark code, and according to the mark bit value; and removing the related class byte if it exists.
3. The method as defined in claim 1, further comprising at least one of the following multilevel mark codes: linked multilevel mark code, which is related to an object in the data source by its location; embedded multilevel mark code, which is embedded together with its related object and codes the code range inside; inter mark multilevel mark code, which is located in the object sequence and codes the object relation information inside; combined multilevel mark code, which combines the object codes according to the coding method of multilevel mark code.
4. The method as defined in claim 1, wherein the object processing is characterized as follows: for a sequence consisting of N nodes, N>=1, each node relating to an object called a node object, and each node object consisting of M sub objects, M>=0, the node object and the sub node objects can be represented and processed by multilevel mark code.
5. The method as defined in claim 1, wherein data compression is processed by one of the following steps: (1) representing the number of repetitions by multilevel mark code when compressing repeated objects in an object sequence; (2) representing objects of high frequency by shorter multilevel mark code; (3) representing objects of low value by shorter multilevel mark code; (4) setting up an object group library outside the object sequence, and representing the related object group in the sequence by multilevel mark code; (5) setting up an object group library inside the object sequence, and representing the related object group in the sequence by multilevel mark code.
6. A method of object representing and processing, characterized as follows: representing and processing a multiple property object by multiple property code; relating a multiple property code with a property of the multiple property object; representing a multiple property object in an object group by the relative multiple property code.
7. The method as defined in claim 6, wherein the said object is a Chinese polyphone character with N pronunciations, N>1; encoding (N−1) multiple property codes to the character, each relating to one pronunciation; representing a polyphone character in a phrase by its multiple property code with the correct pronunciation.
8. The method as defined in claim 7, further characterized as follows: for a Chinese polyphone character with N different pronunciations, the added (N−1) multi-property codes with mark bit 1, 1, if the original code of the polyphone character belongs to GB2312, i.e. with mark bit 1, 0; if the original code of the polyphone character belongs to GBK but not in GB2312, i.e. the original mark bit of the second byte is 0.
9. A method of object representing and processing, characterized as follows: the input code table of coded input objects consists only of the relation between each singular object and its input code; when the object code sequence is consistent with the input code sequence, the table consists only of the input codes of the objects, sequenced according to the object code system sequence; object codes can be non-multilevel mark code or multilevel mark code; the processing method for coded input objects comprises generating the input code from the object codes, that is, generating the input code for the data to be processed according to the object input code table; for a multiple property object, the confusion of input codes for one object code is eliminated by multiple property codes; object inputting or object searching for coded input objects comprises the steps of: (1) inputting input codes for the objects to be inputted or to be searched; (2) generating input codes of the object codes in the related data according to the object input code table; (3) retrieving the inputted objects or searched objects by comparing the inputted input codes with the generated input codes; the output method of object input codes comprises the step of: generating input codes of the object codes in the related data according to the object input code table; in the object processing above, multiple input code tables can be used, each relating to one input method, and one or multiple data sources can be used; in the comparison, if too many objects are matched, further input codes related to another input method are inputted, and the objects to be inputted or searched are then selected from the objects or object groups that match multiple input code tables.
10. The method as defined in claim 9, wherein the object is a Chinese character, and the input code table consists of Chinese sequenced full pronunciations sequenced according to the Chinese character code system sequence; the said Chinese sequenced full pronunciation of a character code consists of two bytes, holding two letters of 5 bits each and one of 5 pronunciation notes in 3 bits; the size sequence of the two letters is selected according to the sequence of Chinese syllables represented by Pinyin (or phonetic notation); the value of the 3 bits is selected according to the note sequence in the standard dictionary.
11. A method of object representing and processing, characterized as follows: coded input objects can be inputted by a cascade input method, in which either a compound input code table is used, the compound input code table containing the relation between objects and input codes for two or more input methods, or multiple input code tables are used, each of the tables relating to one input method; the cascade input method realizes object or object group inputting by inputting input codes of two or more input methods, and cascaded inputting comprises the steps of: (1) inputting input codes of one input method; (2) comparing the inputted codes with the related input codes in the compound input code table or with the related input codes in a single input code table; (3) going to (5) if the objects or object groups to be selected are satisfactory; (4) if too many objects or object groups are waiting to be selected, inputting input codes of another input method, then selecting the results that satisfy both input methods; (5) retrieving the satisfactory results.
12. A method of object representing and processing, characterized as follows: marking object groups in an object sequence with multilevel mark; the said multilevel mark is one of the following: right A multilevel mark: the rightmost object in an object group is marked with A mark, and the other N objects in the object group are marked with B mark; the A mark is called the group representative; left A multilevel mark: the leftmost object in an object group is marked with A mark, and the other N objects in the object group are marked with B mark; the A mark is called the group representative; right B multilevel mark: the rightmost object in an object group is marked with B mark, and the other N objects in the object group are marked with A mark; the B mark is called the group representative; left B multilevel mark: the leftmost object in an object group is marked with B mark, and the other N objects in the object group are marked with A mark; the B mark is called the group representative; the said N is a positive integer or 0.
13. The method as defined in claim 12, wherein objects are encoded by multilevel mark code; the coding of multilevel mark code comprises the steps of: (1) forming the object code from code segments, each segment consisting of coding locations; (2) selecting at least one coding location as the marking location for each segment; (3) marking the marking locations of each segment by multilevel mark.
14. The method as defined in claim 13, wherein, in the coding of multilevel mark code for multiple classes of objects, the multilevel mark code is divided into two parts: class part and coding part; for N coding segments of multilevel mark code, if N>2, then the class segment contains the information which tells whether the next segment (or segments) is a class segment or gives the class information of the coding segment (or segments); if N=2, then the class segment contains the information which tells the coding class of the next segment; the class segment can be one or multiple segments; the class segment can also be part of a segment; when scanning a multilevel mark code sequence with multiple classes along one direction, if the code changes from one class to another class, then the multilevel mark code should carry the class segment related to the object class; afterwards, if the class of the multilevel mark code does not change, the class segment of the multilevel mark code can be omitted; if a class is the default class of multilevel mark code for a definite length of multilevel mark code, then the class segment of the multilevel mark code can also be omitted.
 15. A method ofObject representing and processing, characterized as following:representing and processing tree structure by right matrix, or leftmatrix, or left right matrix, or trapezoid matrix, or list matrix; thecharacteristics of right matrix of a tree is: the root of tree at theright corner of the matrix; the first generation children of the parentat the next row, the children counted from right to left, the rightestchild of them just below the parent, and the other children at the leftside; the characteristics of left matrix of a tree is: the root of treeat the left corner of the matrix; the first generation children of theparent at the next row, the children counted from left to right, theleftest child of them just below the parent, and the other children atthe right side; the characteristics of left right matrix of a tree is:the root of tree at the first row of the matrix, the first generationchildren of the parent at the next row, and divided them into two parts:left part and right part; the right part right to the parent, and theleft part left to the parent; in each column of the matrix only oneelement related to one node of tree structure; the characteristics oftrapezoid matrix of a tree is: the trapezoid matrix is related to Norder tree structure, consisting of one dimension arrays arranged fromtop to bottom, the number element of the arrays is calculated by N^(m),(m=0,1,2 . . . ), and the elements related to the nodes of the treestructure; the characteristics of list matrix of a tree is: list matrixis related to trapezoid matrix; putting elements of trapezoid matrixinto a list, from top to bottom and from left to right, formed the listmatrix.
16. An apparatus of object representing and processing, the object being represented by non-multilevel mark code, or multilevel mark code, or multilevel mark, comprising at least one of the following apparatus: (1) multilevel mark code apparatus: consisting of code segments, the said code segment consisting of binary bits; a mark bit being selected in each segment; and the mark bits of the segments of the object code being marked with multilevel mark; (2) multilevel mark code apparatus with class segment: a multilevel mark code apparatus for multiple classes of objects consisting of two parts: class part and coding part; for an MMC of N coding segments, if N>2, then the class segment contains the information which tells whether the next segment (or segments) is a class segment or a coding segment (or segments); if N=2, then the class segment contains the information which tells the coding class of the next segment; the class part can be one or multiple segments; (3) one of the following multilevel mark code apparatus: (A) linked multilevel mark code, which is related to an object in the data source by its location; (B) embedded multilevel mark code, which is embedded together with its related object and codes the code range inside; (C) inter mark multilevel mark code, which is located in the object sequence and codes the object relation information inside; (D) combined multilevel mark code, which combines the object codes according to the coding method of multilevel mark code; (4) multiple property code apparatus: consisting of multiple property codes for a multiple property object, each of the multiple property codes relating to one property of the multiple property object, the multiple property object in an object group being represented by the multiple property code related to its property; (5) sequenced full pronunciation apparatus of Chinese character: the apparatus consisting of two bytes, holding two letters of 5 bits each and one of 5 pronunciation notes in 3 bits; the size sequence of the two letters is selected according to the sequence of Chinese syllables represented by Pinyin (or phonetic notation); the value of the 3 bits is selected according to the note sequence in the standard dictionary; (6) sequenced full pronunciation comparing apparatus: the apparatus consisting of (A) retrieving apparatus: retrieving the sequenced full pronunciation strings from two Chinese character strings; (B) comparing apparatus: comparing the strings by comparing the related sequenced full pronunciation strings, and retrieving the comparison result of the character strings according to the comparison above; (7) coded input object processing apparatus: consisting of an object input code table; the input code table of coded input objects consists of the relation between each singular object and its input code; when the object code sequence is consistent with the input code sequence, the table consists only of the input codes of the objects, sequenced according to the object code sequence; object codes can be non-multilevel mark code or multilevel mark code; the processing method for coded objects comprises the step of generating the input code for the data to be processed according to the object input code table; for a multiple property object, the confusion of input codes for one object code is eliminated by multiple property codes; object inputting or object searching for coded input objects comprises the steps of: (A) inputting input codes for the objects to be inputted or to be searched; (B) generating input codes of the object codes in the related data according to the object input code table; (C) getting the inputted objects or searched objects by comparing the inputted input codes with the generated input codes; the output method of object input codes comprises the step of: generating input codes of the object codes in the related data according to the object input code table; in the object processing above, multiple input code tables can be used, and one or multiple data sources may be included; in the comparison, if too many objects are matched, the input codes related to another input method are inputted, and the result matching the input codes of multiple input methods is then selected; (8) coded input objects cascade input apparatus, consisting of: a compound input code table, which contains the relation between objects and input codes for two or more input methods; or multiple input code tables, each of which relates to one input method; the cascade input method realizes object or object group inputting by inputting input codes of two or more input methods, and cascaded inputting comprises the steps of: (A) inputting input codes of one class; (B) comparing the inputted codes with the related class of input codes in the compound input code table or with the input codes of the related single input code table; (C) going to (E) if satisfactory results are selected; (D) if too many objects or object groups are to be selected, inputting input codes of another input method, then selecting the satisfactory results; (E) retrieving the satisfactory results; (9) multilevel mark apparatus: consisting of at least one of the following apparatus: right A multilevel mark apparatus: the rightmost object in an object group is marked with A mark, and the other N objects in the object group with B mark; the A mark is called the group representative; left A multilevel mark apparatus: the leftmost object in an object group is marked with A mark, and the other N objects in the object group with B mark; the A mark is called the group representative; right B multilevel mark apparatus: the rightmost object in an object group is marked with B mark, and the other N objects in the object group with A mark; the B mark is called the group representative; left B multilevel mark apparatus: the leftmost object in an object group is marked with B mark, and the other N objects in the object group with A mark; the B mark is called the group representative; the said N is a positive integer or 0.